• 0 Posts
  • 86 Comments
Joined 3 months ago
Cake day: June 4, 2025



  • So like the size of a horse?

    The average horse is about half the height and weight of the average giraffe. Giraffes are just a really bad unit of measurement: males weigh about 400 kg more than females, height varies widely across the global population, and they are technically four different species that we all just call giraffes 🦒



  • Why would you use a large language model to examine a biopsy?

    These should be specialized models trained off structured data sets, not the unbridled chaos of an LLM. They’re both called “AI”, but they’re wildly different technologies.

    It’s like criticizing a doctor for relying on an air conditioner to keep samples cool when in fact they used a freezer, simply because the mechanism of refrigeration is similar.











  • Mixture of experts has been in use since 1991, and it’s essentially a way of splitting the same computation a dense model performs across specialized sub-networks, with a gate deciding which ones run.

    Tanks are an odd comparison, because not only have they changed radically since WW2, to the point that many crew positions have been entirely automated, but also because the role of tanks in modern combat has been radically altered since then (e.g. by the proliferation of drone warfare). They just look sort of similar because of basic geometry.

    Consider the current crop of LLMs as the armor that was deployed in WW1: we can see the promise and potential, but it has not yet been fully realized. A WW1 tank would be no contest against a WW2 tank, and modern armor could destroy both of them with pinpoint accuracy while moving at full speed over rough terrain, outside of radar range (e.g. what happened in the invasion of Iraq).

    It will take many generational leaps across many diverse technologies to get from where we are now to realizing the full potential of large language models. We can’t get there through simple linear progression, any more than tanks could have just kept adding thicker armor and bigger guns; it requires new technologies.
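
    The mixture-of-experts routing mentioned at the top of this comment can be sketched in a few lines. This is purely illustrative (toy dimensions, random weights, top-1 routing — all assumptions, not any real model’s architecture); the point is that the gate activates only a fraction of the total parameters per input, unlike a dense model:

    ```python
    import math
    import random

    random.seed(0)

    DIM = 4          # toy feature size (assumption for illustration)
    NUM_EXPERTS = 3
    # Each "expert" is a tiny linear map; the gate routes each input
    # to its single best expert, so only one expert's weights run.

    def make_weights(rows, cols):
        return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    experts = [make_weights(DIM, DIM) for _ in range(NUM_EXPERTS)]
    gate = make_weights(NUM_EXPERTS, DIM)

    def matvec(w, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

    def softmax(xs):
        m = max(xs)
        exps = [math.exp(v - m) for v in xs]
        s = sum(exps)
        return [e / s for e in exps]

    def moe_forward(x):
        scores = softmax(matvec(gate, x))   # gating probabilities over experts
        best = max(range(NUM_EXPERTS), key=lambda i: scores[i])
        out = matvec(experts[best], x)      # only the chosen expert computes
        return [scores[best] * v for v in out], best

    y, chosen = moe_forward([1.0, 0.5, -0.3, 0.2])
    print(f"routed to expert {chosen}, output dim {len(y)}")
    ```

    A dense layer would instead multiply by all the weights for every input; real MoE layers do the same trick at far larger scale, typically routing to the top-k of many experts.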


  • The gains in AI have been almost entirely in compute power and training, and those gains have run into sharply diminishing returns. At the core it’s all still running the same Markov chains as the machine learning experiments from the dawn of computing; the math is over a hundred years old and basically unchanged.

    For us to see another leap in progress we’ll need to pioneer new calculations and formulate different types of thought, then find a way to integrate that with large transformer networks.
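
    For concreteness, here is the classic Markov-chain idea the comment above refers to, as a toy bigram text generator (the corpus and function names are made up for illustration). The next word depends only on the current word — nothing like a transformer’s full-context attention, which is part of why the comparison is contested:

    ```python
    import random
    from collections import defaultdict

    # Tiny example corpus (assumption for illustration).
    corpus = "the tank moved and the tank fired and the crew cheered".split()

    # Count bigram transitions: word -> list of observed next words.
    transitions = defaultdict(list)
    for a, b in zip(corpus, corpus[1:]):
        transitions[a].append(b)

    def generate(start, length, seed=0):
        """Walk the chain: each step depends only on the current word."""
        random.seed(seed)
        word, out = start, [start]
        for _ in range(length - 1):
            choices = transitions.get(word)
            if not choices:          # dead end: word never appeared mid-corpus
                break
            word = random.choice(choices)
            out.append(word)
        return " ".join(out)

    print(generate("the", 8))
    ```

    Every generated word is drawn only from pairs seen in the corpus, so the chain can never produce a transition it hasn’t literally observed — the limitation that attention-based models were built to escape.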