The expanding network of effective altruism within AI security | The AI Beat

A few days back, a US AI policy expert mentioned to me: “Currently, if you’re not considering the influence of effective altruism (EA), you’re overlooking a significant part of the narrative.”

I must admit, to some extent, I overlooked the story last week.

Oddly enough, I thought the article I published last Friday was a surefire hit. It delved into why esteemed AI labs and respected think tanks are deeply concerned about securing LLM model weights. Seemed timely and clear-cut, at least in my view. Especially with the recent White House AI Executive Order mandating that foundation model companies furnish the federal government with details on “the ownership and possession of the model weights of any dual-use foundation models, and the physical and cybersecurity measures taken to protect those model weights.”

In my article, I spoke with Jason Clinton, Anthropic’s chief information security officer, who explained why safeguarding the model weights for Claude, Anthropic’s LLM, is his top priority. The prospect of opportunistic criminals, terrorist groups, or well-funded nation-state operations getting their hands on the weights of the most advanced LLMs is alarming, he said, because “if an attacker gained access to the entire file, that’s the entire neural network.” Other ‘frontier’ model companies are voicing similar concerns: just recently, OpenAI’s new “Preparedness Framework” addressed the need for “restricting access to critical know-how such as algorithmic secrets or model weights.”

I also had discussions with Sella Nevo and Dan Lahav, two of the five co-authors of a recent report from RAND Corporation, the highly regarded policy think tank, titled “Securing Artificial Intelligence Model Weights.” Nevo, whose bio identifies him as director of RAND’s Meselson Center, which works to mitigate risks from biological threats and emerging technologies, told me it is plausible that within the next two years AI models will have significant national security importance, including the risk that malicious actors could misuse them to develop biological weapons.

What my story left unexplored was the intricate web of connections between the effective altruism (EA) community and the fast-evolving field of AI security, along with its associated policy circles.

Curiously, even though I already knew about Anthropic’s links to the movement (Sam Bankman-Fried’s FTX, for example, held a $500 million stake in the startup), I hadn’t dug into the EA angle. The gap in my coverage became obvious the day after publication, when I came across a Politico article reporting that RAND Corporation researchers had significant influence on provisions in the White House Executive Order, and that RAND received substantial funding this year from Open Philanthropy, an EA organization financed by Facebook co-founder Dustin Moskovitz. Notably, Open Philanthropy co-founder Holden Karnofsky, who served on the OpenAI nonprofit board until 2021, is married to Daniela Amodei, Anthropic’s president and co-founder. These interconnections within the EA landscape illuminate the underlying ties shaping policy and influence in the AI security sphere.

The Politico article also noted that RAND CEO Jason Matheny and senior information scientist Jeff Alstott are recognized effective altruists, and that both have ties to the Biden administration: they worked together at the White House Office of Science and Technology Policy and the National Security Council before joining RAND last year.

After reading the Politico piece, I did some digging of my own, through an extensive Google search and the Effective Altruism Forum, and uncovered several details that add context to my earlier story:

  1. Matheny, RAND’s CEO, is a member of Anthropic’s Long-Term Benefit Trust, an independent body with the authority to select and remove a portion of Anthropic’s board. His term is slated to conclude on December 31 of this year.
  2. Researchers Sella Nevo, Dan Lahav, Jason Matheny, Ajay Karpur, and Jeff Alstott—authors of the RAND LLM model weights report—have strong connections within the EA community. Nevo, notably, expresses enthusiasm for EA-related initiatives.
  3. Nevo’s Meselson Center, along with the LLM model weights report, received philanthropic funding from sources including Open Philanthropy. Open Philanthropy also allocated $100 million to the Georgetown Center for Security and Emerging Technology, where former OpenAI board member Helen Toner holds the position of director of strategy and foundational research grants.
  4. Anthropic’s CISO Jason Clinton participated as a speaker at the EA-funded “Existential InfoSec Forum” in August, an event aimed at fortifying the infosec community’s efforts to reduce existential risks.
  5. Clinton co-manages an EA Infosec book club with colleague Wim van der Schoot, aimed at individuals aligned with effective altruism who want to build their information security skills.
  6. Effective altruism encourages information security as a career path: 80,000 Hours, a project co-founded by EA leader William MacAskill, suggests that securing advanced AI systems could be among the most impactful work one could pursue.

Perhaps the strong connections between effective altruism and AI security shouldn’t come as a surprise. When I followed up with Nevo for more detail on the EA ties to RAND and his Meselson Center, he said the prevalence of EA connections in the AI security community is to be expected: until recently, he explained, the effective altruism community was one of the primary groups focused on discussions, initiatives, and advocacy concerning AI safety and security. “For individuals who have been actively engaged in this field for a considerable time,” he noted, “there’s a good chance they’ve intersected with this community in some capacity.”

Nevo also expressed frustration with the Politico article, saying its seemingly conspiratorial tone implied impropriety on RAND’s part when, in reality, RAND has been providing research and analysis to policymakers for decades. “This is precisely our primary role,” he emphasized.

Nevo clarified that neither he nor the Meselson Center was directly involved in, or even aware of, the Executive Order (EO). He said their work did not shape the EO’s security provisions, though it might have indirectly influenced some of its non-security aspects. He also pointed out that the EO’s provisions on model weight security were already part of the White House Voluntary Commitments, established months before their report.

As for the Meselson Center, about which there is little information available online, Nevo explained that RAND houses numerous research centers and that his is the youngest and, for now, a small one. Its work has focused on areas such as pathogen-agnostic biosurveillance, DNA synthesis screening, dual-use research, and the intersection of AI and biology. Although the center currently has only a handful of researchers, Nevo said they plan to expand its capacity, improve internal sharing of its work, and launch an external website for broader outreach.

So do we need effective altruism on that front? Does the fuss over EA really matter? It brings to mind Jack Nicholson’s iconic line from “A Few Good Men”: “You want me on that wall… you need me on that wall!” If we truly need people on the AI security wall, a front strained by a long-running shortage of cybersecurity talent, does it matter what belief system they subscribe to?

For many of us pushing for transparency from powerful tech companies and policy influencers, it absolutely does. As Politico’s Brendan Bordelon made clear in his recent article, which I admittedly overlooked, these networks will influence policy, regulation, and the trajectory of AI development for the foreseeable future.

The AI policy expert I spoke with made one more point: policymakers often don’t view AI as a realm tied to ideological agendas. Regrettably, as he noted, “they are mistaken.”
