1. Can it mix multiple LLMs?
2. Can it self-improve and learn from its mistakes?
3. Can it create playbooks and then share them, à la the Borg?
4. Can this system do research instead of SWE?