Thinking specifically about AI here: if a process does not give a consistent or predictable output (and cannot reliably replace work done by humans) then can it really be considered “automation”?

  • ganymede@lemmy.ml
    1 day ago

    good points on the training order!

    i was mostly thinking of stochastic processes intentionally introduced during training, e.g. quantisation noise, which is pretty broadband when uncorrelated. even the correlated noise you get from real-world datasets will inevitably contain some non-determinism, though constraints from language “rules” could possibly shape it in interesting ways for LLMs.

    and especially the use of stochastic functions for convergence, stochastic rounding in quantisation etc., not to mention intentionally introduced randomisation in training set augmentation. so i think that for most purposes, with few exceptions, they are mathematically definable as stochastic processes.
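    to make "stochastic rounding" concrete, here is a minimal sketch in plain Python. it is only an illustration, not how any particular training framework does it: the function name `stochastic_round` and the `step` parameter are made up for this example. the idea is to round up with probability equal to the fractional distance from the lower grid point, so the result is unbiased on average.

```python
import math
import random


def stochastic_round(x: float, step: float, rng: random.Random) -> float:
    """Round x to a multiple of `step` (hypothetical helper for illustration).

    Rounds up with probability equal to the fractional distance from the
    lower grid point, so the expected value of the result equals x.
    """
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower  # in [0, 1)
    round_up = rng.random() < frac
    return (lower + (1 if round_up else 0)) * step


if __name__ == "__main__":
    rng = random.Random(0)  # seeded, so this run is reproducible
    samples = [stochastic_round(0.3, 1.0, rng) for _ in range(10_000)]
    # Every sample is 0.0 or 1.0, but the average is close to 0.3.
    print(sum(samples) / len(samples))
```

    each individual result is unpredictable without knowing the generator state, which is the sense in which the process is stochastic, yet with a fixed seed the whole sequence can be replayed exactly.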

    where that overlaps with true theoretical determinism certainly becomes fuzzy without an exact context. afaict most kernel-backed random seeds on x86 since 2015, i.e. on CPUs with the RDSEED instruction, will draw on an asynchronous, thermal-noise-based entropy source within the silicon, approved under NIST SP 800-90B, and a NIST SP 800-90C Non-deterministic Random Bit Generator (NRBG).

    on the architectures more likely to be used for training (GPU/TPU) I think that is going to be a lot rarer, and from a cryptographic perspective hardware implementations of even stochastic rounding are going to be a deterministic circuit under the hood for a while yet.
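    the distinction between a deterministic generator and a kernel-backed entropy source can be sketched in a few lines of Python. the helper names `seeded_noise` and `os_entropy_seed` are made up for this example. `os.urandom` asks the operating system for random bytes; whether those ultimately come from something like RDSEED depends on the platform and kernel, which is an assumption here and not something the code can show.

```python
import os
import random


def seeded_noise(seed: int, n: int) -> list[float]:
    """Pseudorandom noise: fully determined by `seed` (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def os_entropy_seed() -> int:
    """A 64-bit seed taken from the OS random source (hypothetical helper).

    Which hardware, if any, feeds this is platform-dependent.
    """
    return int.from_bytes(os.urandom(8), "big")


if __name__ == "__main__":
    # Same seed, same "noise": a deterministic process under the hood.
    print(seeded_noise(42, 3) == seeded_noise(42, 3))
    # A seed from the OS will almost certainly differ from run to run.
    print(os_entropy_seed())
```

    so a pipeline that seeds everything from a fixed integer is deterministic in principle, while one seeded from the OS source inherits whatever non-determinism that source provides.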

    but given the combination of overwhelming complexity, trade secrets and classical high entropy sources, I think most serious attempts at formal proofs would have to resign themselves to stochastic terms in their formulation for some time yet.

    there may be some very specific and non-general exceptions, and i do believe this is going to change in the future as both extremes (highly formal AI models, and non-deterministic hardware backed instructions) are further developed. and ofc overcoming the computational resource hurdles for training could lead to relaxing some of the current practical requirements for stochastic processes during training.

    this is ofc only afaict, i don’t work in LLM field.

    • CanadaPlus@lemmy.sdf.org
      16 hours ago

      In practice there really is no incentive to avoid stochastic or pseudorandom elements, so don’t hold your breath, haha. It’s a pretty academic question if you could theoretically train an LLM without any randomness.

      Thanks for writing that up, I learned a few things.

      • ganymede@lemmy.ml
        30 minutes ago

        Exactly!

        Thanks for reading :) Realised i was going on a bit of a rant, but thought why not keep going lol