12345671 of 13
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
On Typability in Programming Languages
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.ORCID iD: 0009-0005-2855-136X
2025 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This dissertation examines the relation between expressivity in programming languages and typability of programs.

For all data used in computer programs, it makes sense to perform a certain set of operations on it. The set of meaningful operations vary between different kinds of data, and this defines the type of the data. This means that all data used in computer programs has a type, but programs implemented using different languages treat information about types in different ways. The type information is accessible at different times throughout program implementation and execution.

In statically typed languages the type of all data is checked during compile time. This can provide programmers with valuable early detection of errors but can also lead to the rejection of programs that would actually run without any errors. In dynamically typed languages the approach is instead to defer checking if operations performed on some data is valid to run time. These checks are typically made at the the point in time when the operation is about to be performed. This allows an informed programmer to run programs that a static type system could not guarantee would be free from errors.

There is a trade-off between expressivity and e.g., early error detection and the possibility to use different kinds of optimisation techniques. This dissertation approaches the trade-off problem from several angles. We have examined existing programs to find out how code is written when there are no constraints from static types, both for the use of polymorphism in the dynamically typed language Python where there are no static typing at all, and access patterns used for arrays in the statically typed language Java. The latter has been used to evaluate the expressivity of array capabilities, a novel technique for statically preventing data races in parallel algorithms manipulating arrays.

Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University , 2025. , p. 86
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 25-003
Keywords [en]
Type systems, programming languages
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-238649ISBN: 978-91-8107-100-9 (print)ISBN: 978-91-8107-101-6 (electronic)OAI: oai:DiVA.org:su-238649DiVA, id: diva2:1934393
Public defence
2025-03-20, Earth, Borgarfjordsgatan 12, Kista, Stockholm, 13:00 (English)
Opponent
Supervisors
Available from: 2025-02-25 Created: 2025-02-04 Last updated: 2025-02-18Bibliographically approved
List of papers
1. Tracing Dynamic Features in Python Program
Open this publication in new window or tab >>Tracing Dynamic Features in Python Program
2014 (English)In: Proceedings of the 11th Working Conference on Mining Software Repositories, New York: ACM Press, 2014, p. 292-295Conference paper, Published paper (Refereed)
Abstract [en]

Recent years have seen a number of proposals for adding (retrofitting) static typing to dynamic programming languages, a natural consequence of their growing popularity for non-toy applications across a multitude of domains. These proposals often make assumptions about how programmers write code, and in many cases restrict the way the languages can be used. In the context of Python, this paper describes early results from trace-based collection of run-time data about the use of built-in language features which are inherently hard to type, such as dynamic code generation. The end goal of this work is to facilitate static validation tooling for Python, in particular retrofitting of type systems.

Place, publisher, year, edition, pages
New York: ACM Press, 2014
Keywords
Dynamic languages, open source, Python, dynamic features
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-108661 (URN)10.1145/2597073.2597103 (DOI)2-s2.0-84938809405 (Scopus ID)978-1-4503-2863-0 (ISBN)
Conference
36th International Conference on Software Engineering, Hyderabad, India — May 31 - June 07, 2014
Available from: 2014-10-31 Created: 2014-10-31 Last updated: 2025-02-05Bibliographically approved
2. Measuring Polymorphism in Python Programs
Open this publication in new window or tab >>Measuring Polymorphism in Python Programs
2016 (English)In: SIGPLAN notices, ISSN 0362-1340, E-ISSN 1558-1160, Vol. 51, no 2, p. 114-128Article in journal (Refereed) Published
Abstract [en]

Following the increased popularity of dynamic languages and their increased use in critical software, there have been many proposals to retrofit static type system to these languages to improve possibilities to catch bugs and improve performance. A key question for any type system is whether the types should be structural, for more expressiveness, or nominal, to carry more meaning for the programmer. For retrofitted type systems, it seems the current trend is using structural types. This paper attempts to answer the question to what extent this extra expressiveness is needed, and how the possible polymorphism in dynamic code is used in practise. We study polymorphism in 36 real-world open source Python programs and approximate to what extent nominal and structural types could be used to type these programs. The study is based on collecting traces from multiple runs of the programs and analysing the polymorphic degrees of targets at more than 7 million call-sites. Our results show that while polymorphism is used in all programs, the programs are to a great extent monomorphic. The polymorphism found is evenly distributed across libraries and program-specific code and occur both during program start-up and normal execution. Most programs contain a few megamorphic call-sites where receiver types vary widely. The non-monomorphic parts of the programs can to some extent be typed with nominal or structural types, but none of the approaches can type entire programs.

Keywords
Python, dynamic languages, polymorphism, trace-based analysis
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:su:diva-131566 (URN)10.1145/2816707.2816717 (DOI)000376395600010 ()
Available from: 2016-06-28 Created: 2016-06-21 Last updated: 2025-02-04Bibliographically approved
3. Reference Capabilities for Safe Parallel Array Programming
Open this publication in new window or tab >>Reference Capabilities for Safe Parallel Array Programming
2020 (English)In: The Art, Science, and Engineering of Programming, E-ISSN 2473-7321, Vol. 4, no 1, article id 1Article in journal (Refereed) Published
Abstract [en]

The array is a fundamental data structure that provides an efficient way to store and retrieve non-sparse data contiguous in memory. Arrays are important for the performance of many memory-intensive applications due to the design of modern memory hierarchies: contiguous storage facilitates spatial locality and predictive access patterns which enables prefetching. Operations on large arrays often lend themselves well to parallelisation, such as a fork-join style divide-and-conquer algorithm for sorting. For parallel operations on arrays to be deterministic, data-race freedom must be guaranteed. For operations on arrays of primitive data, data-race freedom is obtained by coordinating accesses so that no two threads operate on the same array indices. This is however not enough for arrays of non-primitives due to aliasing: accesses of separate array elements may return pointers to the same object, or overlapping structures. Reference capabilities have been used successfully in the past to statically guarantee the absence of data-races in object-oriented programs. This paper presents the first extension of reference capabilities—called array capabilities—that support concurrent and parallel operations on arrays of both primitive and non-primitive values. In addition to element access, array capabilities support the abstract manipulation of arrays, logical splitting of arrays into subarrays, and merging subarrays. These operations allow expressing a wide range of array use cases. (edited) This paper presents the array capability design space and show how it applies to a number of array use cases. The core ideas are formalised and proven sound in a simple calculus, along with a proof that shows that well-typed programs with array capabilities are free from data-races.

Keywords
Type systems, Capabilities, Parallelism, Arrays
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-177174 (URN)10.22152/programming-journal.org/2020/4/1 (DOI)
Available from: 2019-12-17 Created: 2019-12-17 Last updated: 2025-02-04Bibliographically approved
4. Progress Report: Exploring API Design for Capabilities for Programming with Arrays
Open this publication in new window or tab >>Progress Report: Exploring API Design for Capabilities for Programming with Arrays
2019 (English)In: ICOOOLPS '19: Proceedings of the 14th Workshop on Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems, Association for Computing Machinery (ACM), 2019, article id 3Conference paper, Published paper (Refereed)
Abstract [en]

In on-going work, we are exploring reference capabilities for arrays, with the intention of carrying over previous results on statically guaranteed data-race freedom to parallel array algorithms. Reference capabilities typically restrict incoming pointers to an object to one (uniqueness), or restrict operations via multiple pointer to a single object (e.g., to only read). Extending such a design to arrays involve operations such as logically partitioning an array so that even though there are multiple pointers to a single array, these pointers cannot access the same elements.

In this paper, we report on the on-going work of a prototype implementation of array capabilities, focusing in particular on the "array capability API design", meaning the native operations on capabilities such as splitting and merging arrays. Using our prototype implementation, we translate several existing array algorithms into using array capabilities and qualitatively study the result. In addition to identifying the need for additional operations, we study what features are commonly exercised, what are the recurring patterns, and how reliance on direct element addressing using indexes can be reduced. We end by discussing a possible design for a more performant implementation once the API is fixed.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2019
Keywords
Type systems, Capabilities, Parallelism, Arrays
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-177173 (URN)10.1145/3340670.3342427 (DOI)978-1-4503-6862-9 (ISBN)
Conference
ICOOOLPS '19, London, UK, 15 - 19 July, 2019
Available from: 2019-12-17 Created: 2019-12-17 Last updated: 2025-02-04Bibliographically approved
5. Arrays in practice: An Empirical Study of Array Access Patterns on the JVM
Open this publication in new window or tab >>Arrays in practice: An Empirical Study of Array Access Patterns on the JVM
2024 (English)In: Art, Science, and Engineering of Programming, ISSN 2473-7321, Vol. 8, no 3, p. 14:1-14:31Article in journal (Refereed) Published
Abstract [en]

The array is a data structure used in a wide range of programs. Its compact storage and constant time random access makes it highly efficient, but arbitrary indexing complicates the analysis of code containing array accesses. Such analyses are important for compiler optimisations such as bounds check elimination. The aim of this work is to gain a better understanding of how arrays are used in real-world programs. While previous work has applied static analyses to understand how arrays are accessed and used, we take a dynamic approach. We empirically examine various characteristics of array usage by instrumenting programs to log all array accesses, allowing for analysis of array sizes, element types, from where arrays are accessed and to which extent sequences of array accesses form recognizable patterns. The programs in the study were collected from the Renaissance benchmark suite, all running on the Java Virtual Machine. We account for characteristics displayed by the arrays investigated, finding that most arrays have a small size, are accessed by only one or two classes and by a single thread. On average over the benchmarks, 69.8% of the access patterns consist of uncomplicated traversals. Most of the instrumented classes (over 95%) do not use arrays directly at all. These results come from tracing data covering 3,803,043,390 array accesses made across 168,686 classes. While our analysis has only been applied to the Renaissance benchmark suite, the methodology can be applied to any program running on the Java Virtual Machine. This study, and the methodology in general, can inform future runtime implementations and compiler optimisations.

Keywords
Arrays, Java Virtual Machine, Tracing
National Category
Computer Sciences Software Engineering
Identifiers
urn:nbn:se:su:diva-236730 (URN)10.22152/programming-journal.org/2024/8/14 (DOI)2-s2.0-85188949849 (Scopus ID)
Available from: 2024-12-05 Created: 2024-12-05 Last updated: 2025-02-04Bibliographically approved

Open Access in DiVA

On Typability in Programming Languages(4007 kB)68 downloads
File information
File name FULLTEXT01.pdfFile size 4007 kBChecksum SHA-512
20ca137b7fc613ccfb9a437ee5bcd87ef7a09c60223bdc4bccd4ac4db3ba8f43a2998d53f9110d5a81432a08cb691851a98b055a4bbfe49ae4021e2bdf44f2bf
Type fulltextMimetype application/pdf

Authority records

Åkerblom, Beatrice

Search in DiVA

By author/editor
Åkerblom, Beatrice
By organisation
Department of Computer and Systems Sciences
Computer Sciences

Search outside of DiVA

GoogleGoogle Scholar
Total: 68 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

isbn
urn-nbn

Altmetric score

isbn
urn-nbn
Total: 376 hits
12345671 of 13
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf