11/1/2025
wow, i noticed that people are actually clicking on this blog... there have been 200+ combined views across all posts... um, oops. happy new month! here's your regular dose of random articles.

small samples can poison an llm of any size. we might be cooked! "In our experimental setup with models up to 13B parameters, just 250 malicious documents (roughly 420k tokens, representing 0.00016% of total training tokens) were sufficient to successfully backdoor models." still, there's post-training and other defense mechanisms that make this sort of poisoning impractical in real life. (a quick back-of-envelope check on those numbers is at the end of this post.)

using ai to simulate cells. simulated cells can be customized to a specific patient and can reduce the expense of experimentation; previous non-ai models were very limited and needed a lot of compute. an early cell foundation model called geneformer apparently helped identify a molecule that could cure a type of heart disease, which was then confirmed by irl experiments. another model, TranscriptFormer, was really good at classifying d...
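about that poisoning stat: the 0.00016% figure implies a specific training-set size, so here's the back-of-envelope check i promised (my own arithmetic, not from the paper):

```python
# sanity-checking the poisoning stat from the quote above
poison_tokens = 420_000              # ~420k tokens across the 250 malicious docs
poison_fraction = 0.00016 / 100      # "0.00016% of total training tokens"

# implied total size of the training set
total_tokens = poison_tokens / poison_fraction
print(f"{total_tokens:.3g}")         # ~2.62e+11, i.e. roughly 260B tokens

# that lines up with chinchilla-style training (~20 tokens per parameter)
# for the 13B model they mention: 13e9 * 20 = 2.6e11
print(13e9 * 20)
```

so the scary part isn't that 250 docs is a big fraction of the data (it's tiny), it's that the absolute number of poisoned docs needed stays roughly constant while the model and training set grow.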
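on the cell models: the article summary above doesn't get into how they work, but if i remember the geneformer paper right, it represents each cell as a sequence of gene tokens ordered by how unusually expressed each gene is in that cell, and then trains an ordinary transformer on those sequences. here's a toy sketch of that "rank value encoding" idea (names, shapes, and the vocab mapping are my own invention, not their code):

```python
# toy sketch of geneformer-style rank value encoding: one cell's
# expression vector becomes a ranked sequence of gene token ids
import numpy as np

def rank_value_encode(expression, median_expression, vocab, max_len=2048):
    """expression: raw counts for one cell, shape (n_genes,).
    median_expression: per-gene medians across the corpus.
    vocab: gene index -> token id (hypothetical mapping)."""
    # normalize by corpus-wide medians so ubiquitous housekeeping
    # genes don't dominate every cell's ranking
    scores = expression / median_expression
    expressed = np.nonzero(expression)[0]
    # order expressed genes by normalized score, highest first
    order = expressed[np.argsort(-scores[expressed])]
    return [vocab[g] for g in order[:max_len]]

# tiny fake example: 5 genes, one cell
expr = np.array([0.0, 10.0, 3.0, 0.0, 7.0])
med = np.array([1.0, 20.0, 1.0, 1.0, 2.0])   # gene 1 is usually high everywhere
vocab = {i: i + 10 for i in range(5)}        # hypothetical token ids
print(rank_value_encode(expr, med, vocab))   # [14, 12, 11]
```

the neat trick is that this turns messy continuous expression data into exactly the kind of token sequence transformers already eat for breakfast.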