
Center for Algorithms, Data, and Market Design at Yale (CADMY)

CADMY is an innovative research center working at the intersection of computer science, economics, and data science. The Center aims to support Yale faculty and students with their research in relevant areas and will serve as a platform to host visiting faculty and postdoctoral fellows, promoting ongoing academic engagement and advancement.

With the arrival of the Internet, and with rapid increases in the capacity to transmit, communicate, and process data and information, algorithms and data have become central objects of interest in computer science, data science, and economics. Data and digital information have become essential for the allocation and distribution of services and commodities worldwide, including the design of markets and resource allocation mechanisms.

From traffic navigation apps to social networks, algorithms and data have become essential. With the arrival of large language models that build algorithms on massive data sets, these developments in artificial intelligence have only accelerated. The question of how to collect, aggregate, and disseminate data among diverse individuals in a decentralized society is critical for the functioning of democracy, as well as for fair and efficient markets.

CADMY’s goal is to initiate and support research and teaching around the fundamental questions that arise at the intersection of computer science, data science, economics, and computational social sciences.

For more information about CADMY and research areas, please visit cadmy.yale.edu.

Latest Publications

Discussion Paper
Abstract

A soft-floor auction asks bidders to accept an opening price to participate in a second-price auction. If no bidder accepts, lower bids are considered under first-price rules. Soft floors are common in practice despite being irrelevant under standard assumptions. When bidders regret losing, soft-floor auctions are more efficient and profitable than standard optimal auctions. Revenue increases because bidders are inclined to accept the opening price in order to compete in a regret-free second-price auction. Efficiency improves because a soft floor allows for a lower hard reserve, reducing the frequency of no sale. Theory and experiment confirm these motivations from practice.
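The allocation and payment rules described above can be sketched in a few lines. This is a toy reading of the abstract, not the paper's exact specification: the assumption that a bidder "accepts" whenever their bid meets the soft floor, and the tie-breaking, are illustrative simplifications.

```python
def soft_floor_outcome(bids, soft_floor, hard_reserve):
    """Toy soft-floor auction outcome (illustrative assumptions only).

    Bids at or above the soft floor are treated as accepting the opening
    price and compete in a second-price auction; if none accept, the
    highest bid above the hard reserve wins at its own bid (first-price).
    Returns (winning_bid, price_paid), or (None, None) if no sale.
    """
    accepting = [b for b in bids if b >= soft_floor]
    if accepting:
        winner = max(accepting)
        others = sorted(accepting, reverse=True)[1:]
        # Second-price rule: pay the runner-up bid, but never less
        # than the opening price itself.
        price = max(others[0] if others else soft_floor, soft_floor)
        return winner, price
    eligible = [b for b in bids if b >= hard_reserve]
    if eligible:
        winner = max(eligible)
        return winner, winner  # First-price rule: winner pays own bid.
    return None, None  # Bids below the hard reserve: no sale.
```

The sketch also shows the efficiency channel the abstract mentions: a low hard reserve keeps the first-price branch available when no one accepts the soft floor, so the item sells more often.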

Discussion Paper
Abstract

As AI systems shift from directing users to content toward consuming it directly, publishers need a new revenue model: charging AI crawlers for content access. This model, called pay-per-crawl, must solve a problem of mechanism selection at scale: content is too heterogeneous for a fixed pricing framework. Different sub-types warrant not only different price levels but different pricing rules based on different unstructured features, and there are too many to enumerate or design by hand. We propose the LM Tree, an adaptive pricing agent that grows a segmentation tree over the content library, using LLMs to discover what distinguishes high-value from low-value items and apply those attributes at scale, from binary purchase feedback alone. We evaluate the LM Tree on real content from a major German technology publisher, using 8,939 articles and 80,451 buyer queries with willingness-to-pay calibrated from actual AI crawler traffic. The LM Tree achieves a 65% revenue gain over a single static price and a 47% gain over two-category pricing, outperforming even the publisher’s own 8-segment editorial taxonomy by 40%—recovering content distinctions the publisher’s own categories miss.
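The core loop, adapting per-segment prices from binary purchase feedback alone, can be illustrated with a deliberately stripped-down sketch. The class name, the fixed segmentation, and the multiplicative update rule are all assumptions for illustration; the actual LM Tree additionally uses an LLM to grow the segmentation tree itself.

```python
class SegmentPricer:
    """Toy per-segment price learner driven by accept/reject feedback.

    Each content segment carries its own price, nudged upward after a
    sale and downward after a rejection. This captures only the pricing
    half of the LM Tree idea; segment discovery is omitted.
    """

    def __init__(self, segments, start_price=1.0, step=0.1):
        self.prices = {s: start_price for s in segments}
        self.step = step

    def quote(self, segment):
        """Current posted price for a content segment."""
        return self.prices[segment]

    def update(self, segment, purchased):
        """Multiplicative update: probe upward after a purchase,
        back off after a rejection."""
        factor = 1 + self.step if purchased else 1 - self.step
        self.prices[segment] *= factor
```

Even this crude rule lets prices diverge across segments from binary signals alone, which is the behavior the two-category and 8-segment baselines in the abstract are compared against.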

Discussion Paper
Abstract

We study the design of efficient dynamic recommendation systems, such as AI shopping assistants, in which a platform interacts with a user over multiple rounds to identify the most suitable product among those offered by advertisers. Advertisers have multi-dimensional private information: their private value from a purchase and private information about the user’s preferences. In each round, the platform displays recommendations; the user learns product characteristics of the shown items and then chooses whether to purchase, exit without purchasing, or submit a new query. These actions generate a stream of feedback—purchase, exit, and follow-up queries—that is informative about the user’s preferences and can be used both to refine future recommendations and to design contingent transfers. We introduce a class of data-driven dynamic team mechanisms that condition payments on realized user feedback. Our main result shows that data-driven dynamic team mechanisms achieve periodic ex-post implementation of the efficient allocation rule. We then develop variants that guarantee participation and deliver budget surplus, and provide conditions under which these properties can be jointly attained.

Discussion Paper
Abstract

Bilateral bargaining under incomplete information provides a controlled testbed for evaluating large language model (LLM) agent capabilities. Bilateral trade demands individual rationality, strategic surplus maximization, and cooperation to realize gains from trade. We develop a structured bargaining environment in which LLMs negotiate via tool calls within an event-driven simulator, separating binding offers from natural-language messages to enable automated evaluation. The environment serves two purposes: as a benchmark for frontier models and as a training environment for open-weight models via reinforcement learning. In benchmark experiments, a round-robin tournament among five frontier models (15,000 negotiations) reveals that effective strategies implement price discrimination through sequential offers. Aggressive anchoring, calibrated concession, and temporal patience are associated with both the highest surplus share and the highest deal rate. Accommodating strategies that concede quickly disable price discrimination in the buyer role, yielding the lowest surplus capture and deal completion. Strategically competent models scale their behavior proportionally to item value, maintaining consistent performance across price tiers; weaker models perform well only when wide zones of possible agreement compensate for suboptimal strategies. In training experiments, we fine-tune Qwen3 (8B, 14B) via supervised fine-tuning (SFT) followed by Group Relative Policy Optimization (GRPO) against a fixed frontier opponent. The two stages optimize competing objectives: SFT approximately doubles surplus share but reduces deal rates, while RL recovers deal rates but erodes surplus gains—a tension traceable to the reward structure. SFT also compresses surplus variation across price tiers, and this compression generalizes to opponents unseen during training, suggesting that behavioral cloning instills proportional strategies rather than memorized price points.
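The surplus accounting behind these results can be made concrete with a minimal evaluator for a single negotiation episode. The function name and return shape are hypothetical; the sketch only encodes the standard bilateral-trade logic the abstract relies on: a deal is individually rational when the price falls in the zone of possible agreement, and the price splits the gains from trade.

```python
def negotiate_outcome(buyer_value, seller_cost, agreed_price):
    """Toy evaluator for one bilateral-trade episode (illustrative only).

    A deal is individually rational when the agreed price lies in the
    zone of possible agreement [seller_cost, buyer_value]; the total
    surplus (buyer_value - seller_cost) is then split by the price.
    agreed_price is None when negotiation ended without a deal.
    """
    if agreed_price is None or not (seller_cost <= agreed_price <= buyer_value):
        return {"deal": False, "buyer_surplus": 0.0, "seller_surplus": 0.0}
    return {
        "deal": True,
        "buyer_surplus": buyer_value - agreed_price,
        "seller_surplus": agreed_price - seller_cost,
    }
```

Metrics like "surplus share" and "deal rate" in the abstract are aggregates of exactly these per-episode quantities, which is why a wide zone of possible agreement can mask weak strategies: almost any agreed price clears the individual-rationality check.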

Discussion Paper
Abstract

Here we provide our solutions to the First Proof questions. We also discuss the best responses from publicly available AI systems that we were able to obtain in our experiments prior to the release of the problems on February 5, 2026. We hope this discussion will help readers with the relevant domain expertise to assess such responses.

Discussion Paper
Abstract

We develop a framework for the optimal pricing and product design of LLMs in which a provider sells menus of token budgets to users who differ in their valuations across a continuum of tasks. Under a homogeneous production technology, we show that users’ high-dimensional type profiles are summarized by a scalar index, reducing the seller’s problem to one-dimensional screening. The optimal mechanism takes the form of committed-spend contracts: buyers pay for a budget that they allocate across token classes priced at marginal cost. We extend the analysis to environments with multiple differentiated models and to competition between a proprietary leader and an open-source fringe, showing that competitive pressure reshapes both the intensive and extensive margins of compute provision. Each element of our theory (token-budget menus, maximum- and minimum-spend plans, multi-model versioning, and linear API pricing) has a direct counterpart in the observed pricing practices of providers such as Anthropic, OpenAI, and GitHub.
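The committed-spend structure, a prepaid token budget allocated across token classes priced at marginal cost, can be sketched with a toy allocation routine. All names, the greedy order, and the numbers are assumptions for illustration, not the paper's mechanism.

```python
def allocate_budget(budget, marginal_costs, requests):
    """Toy draw-down of a committed token budget (illustrative only).

    The buyer has prepaid `budget`; each token class is priced at its
    marginal cost. Requests are filled in order until the budget runs
    out. Returns (tokens bought per class, unspent budget).
    """
    bought = {}
    remaining = budget
    for cls, qty in requests:
        cost = marginal_costs[cls]
        # Buy the requested quantity, capped by what the budget affords.
        take = min(qty, int(remaining // cost)) if cost > 0 else qty
        bought[cls] = take
        remaining -= take * cost
    return bought, remaining
```

The point of the sketch is the linear-pricing element the abstract highlights: within the committed budget, every marginal token is charged at cost, so the screening happens through the size of the budget, not through per-token markups.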

Discussion Paper
Abstract

This paper develops a framework in which a multiproduct ecosystem competes with multiple single-product firms in both price and innovation. The ecosystem can use data from one product to improve the quality of its other products. We use the framework to study three regulatory policies aimed at leveling the playing field. Restricting the ecosystem’s cross-product data usage, or forcing it to share data with single-product firms, benefits those firms and induces them to innovate more. However, these policies also dampen the ecosystem’s incentive to collect data and innovate, potentially raising prices. Consumers are better off only when single-product firms are sufficiently good at innovating. Facilitating data exchange between single-product firms via a data cooperative can backfire and harm them, because it induces the ecosystem to price more aggressively. For both the data-sharing and data-cooperative policies, there exist data-compensation schemes such that consumers are better off compared to no regulation.