Instytut Podstawowych Problemów Techniki Polskiej Akademii Nauk
(Institute of Fundamental Technological Research, Polish Academy of Sciences)

Partners

C. Rojas


Conference papers
1.  Kozachinskiy A., Urrutia F., Jimenez H., Steifer T., Pizarro G., Fuentes M., Meza F., Calderon C.B., Rojas C., Strassen Attention, Split VC Dimension and Compositionality in Transformers, NeurIPS 2025, 39th Conference on Neural Information Processing Systems, 2025-11-30/12-07, San Diego (US), pp.1-32, 2025

Abstract:
We propose the first method to show theoretical limitations for one-layer softmax Transformers with arbitrarily many precision bits (even infinite). We establish those limitations for three tasks that require advanced reasoning. The first task, Match 3 (Sanford et al., 2023), requires looking at all possible token triplets in an input sequence. The second and third tasks address compositionality-based reasoning: function composition (Peng et al., 2024) and binary relation composition, respectively. We formally prove the inability of one-layer softmax Transformers to solve any of these tasks. To overcome these limitations, we introduce Strassen attention and prove that, equipped with this mechanism, a one-layer Transformer can in principle solve all these tasks. Importantly, we show that it enjoys sub-cubic running-time complexity, making it more scalable than similar previously proposed mechanisms, such as higher-order attention (Sanford et al., 2023). To complement our theoretical findings, we experimentally studied Strassen attention and compared it against standard attention (Vaswani et al., 2017), higher-order attention (Sanford et al., 2023), and triangular attention (Bergen et al., 2021). Our results help to disentangle all these attention mechanisms, highlighting their strengths and limitations. In particular, Strassen attention significantly outperforms standard attention on all the tasks. Altogether, understanding the theoretical limitations can guide research towards scalable attention mechanisms that improve the reasoning abilities of Transformers.
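To make the first two tasks concrete, here is a minimal brute-force sketch of Match 3 and of a table-lookup form of function composition. The function names and the dict-based encoding are illustrative choices, not taken from the paper; the cubic loop is exactly the all-triplets scan the abstract refers to, which Strassen attention is designed to beat with sub-cubic complexity.

```python
def match3(xs, M):
    """Match 3 (Sanford et al., 2023): do three positions i < j < k exist
    with x_i + x_j + x_k congruent to 0 mod M?  Brute force, O(n^3)."""
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                if (xs[i] + xs[j] + xs[k]) % M == 0:
                    return True
    return False

def compose_query(f, g, a):
    """Function composition task (Peng et al., 2024), with the two
    functions given as finite lookup tables (dicts): return g(f(a))."""
    return g[f[a]]
```

For example, `match3([1, 2, 3], 6)` is `True` (1 + 2 + 3 = 6), while `match3([1, 1, 1], 5)` is `False`.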

Author affiliations:
Kozachinskiy A. - other affiliation
Urrutia F. - other affiliation
Jimenez H. - other affiliation
Steifer T. - IPPT PAN
Pizarro G. - other affiliation
Fuentes M. - other affiliation
Meza F. - other affiliation
Calderon C.B. - other affiliation
Rojas C. - other affiliation
200 pts.
2.  Barceló P., Duarte M., Rojas C., Steifer T., No Agreement Without Loss: Learning and Social Choice in Peer Review, ECAI 2023, 26th European Conference on Artificial Intelligence, 2023-09-30/10-04, Kraków (PL), DOI: 10.3233/FAIA230270, pp.190-197, 2023

Abstract:
In peer review systems, reviewers are often asked to evaluate various features of submissions, such as technical quality or novelty. A score is given to each of the predefined features, and based on these the reviewer has to provide an overall quantitative recommendation. It may be assumed that each reviewer has her own mapping from the set of features to a recommendation, and that different reviewers have different mappings in mind. This introduces an element of arbitrariness known as commensuration bias. In this paper we discuss a framework, introduced by Noothigattu, Shah and Procaccia, and then applied by the organizers of the AAAI 2022 conference. Noothigattu, Shah and Procaccia proposed to aggregate reviewers' mappings by minimizing certain loss functions, and studied axiomatic properties of this approach, in the sense of social choice theory. We challenge several of the results and assumptions used in their work and report a number of negative results. On the one hand, we study a trade-off between some of the axioms proposed and the ability of the method to properly capture agreements of the majority of reviewers. On the other hand, we show that dropping a certain unrealistic assumption has dramatic effects, including causing the method to be discontinuous.
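One concrete instance of the loss-minimization aggregation discussed above: when the loss is the L1 (absolute-value) loss, the overall score that minimizes the total loss across reviewers is a median of their individual recommendations. The sketch below is a deliberately simplified illustration of that single fact; the function name and the pointwise evaluation are assumptions for illustration, not the full Noothigattu-Shah-Procaccia framework.

```python
import statistics

def aggregate_l1(reviewer_maps, features):
    """Evaluate each reviewer's mapping on a feature vector and return
    the recommendation y minimizing the total L1 loss sum_r |y - y_r|.
    The minimizer of an L1 sum is a median of the individual scores."""
    scores = [f(features) for f in reviewer_maps]
    return statistics.median(scores)
```

For example, three reviewers whose mappings yield 3, 5, and 9 on the same submission aggregate to 5, the median; a single outlying reviewer cannot drag the aggregate arbitrarily, which is one reason L1-style losses are studied in this setting.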

Author affiliations:
Barceló P. - other affiliation
Duarte M. - other affiliation
Rojas C. - other affiliation
Steifer T. - IPPT PAN
140 pts.
3.  Delle Rose V., Kozachinskiy A., Rojas C., Steifer T., Find a witness or shatter: the landscape of computable PAC learning, COLT 2023, The Thirty Sixth Annual Conference on Learning Theory, 2023-07-12/07-15, Bangalore (IN), No.195, pp.1-14, 2023

Abstract:
This paper contributes to the study of CPAC learnability—a computable version of PAC learning—by solving three open questions from recent papers. Firstly, we prove that every improperly CPAC learnable class is contained in a class which is properly CPAC learnable with polynomial sample complexity. This confirms a conjecture by Agarwal et al. (COLT 2021). Secondly, we show that there exists a decidable class of hypotheses which is properly CPAC learnable, but only with uncomputably fast-growing sample complexity. This solves a question from Sterkenburg (COLT 2022). Finally, we construct a decidable class of finite Littlestone dimension which is not improperly CPAC learnable, strengthening a recent result of Sterkenburg (2022) and answering a question posed by Hasrati and Ben-David (ALT 2023). Together with previous work, our results provide a complete landscape for the learnability problem in the CPAC setting.
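The title's "shatter" and the VC dimension in the keywords refer to a standard combinatorial notion that is easy to state in code: a set of points is shattered by a hypothesis class if every 0/1 labelling of those points is realized by some hypothesis, and the VC dimension is the largest size of a shattered set. A minimal brute-force sketch (function names are illustrative; this is the classical definition, not the paper's computability-theoretic machinery):

```python
from itertools import combinations

def shatters(hypotheses, points):
    """A finite class of 0/1-valued hypotheses shatters `points` iff
    every one of the 2^|points| labellings is realized by some h."""
    realized = {tuple(h(x) for x in points) for h in hypotheses}
    return len(realized) == 2 ** len(points)

def vc_dimension(hypotheses, domain):
    """Largest size of a subset of `domain` shattered by the class.
    Brute force over all subsets; exponential, for illustration only."""
    d = 0
    for k in range(1, len(domain) + 1):
        if any(shatters(hypotheses, s) for s in combinations(domain, k)):
            d = k
    return d
```

For instance, the class of threshold functions h_t(x) = 1[x >= t] shatters any single point but no pair (the labelling "left point 1, right point 0" is never realized), so its VC dimension is 1.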

Keywords:
PAC learnability, CPAC learnability, VC dimension, Littlestone dimension, computability, foundations of machine learning

Author affiliations:
Delle Rose V. - University of Siena (IT)
Kozachinskiy A. - other affiliation
Rojas C. - other affiliation
Steifer T. - IPPT PAN
200 pts.
