(Based) Simple linear attention language models balance the recall-throughput tradeoff