SRC participant details
The first-round poster session of the Modularity 2015 SRC event will occur during both the morning and afternoon breaks on Tuesday, March 17th. The second-round presentation session will be on Wednesday, March 18th, at the 10:30am conference session. The following are the four students participating in the graduate division of the Modularity 2015 SRC.
Arik Hadas, Open University of Israel
Leonardo Passos, University of Waterloo
Bahram Zarrin, Technical University of Denmark
Thomas Degueule, IRISA - INRIA
Arik Hadas, Open University of Israel
"A Language Workbench for Creating Production-Ready Extensions to AspectJ"
While AspectJ is the de facto standard for AOP in Java, it has its share of limitations (e.g., a rigid join point model, fragile pointcuts, state-point separation issues, and an imperative syntax for advice). Researchers have proposed various remedies to these limitations in the form of extensions to AspectJ, including declarative syntax for aspects of a specific domain, extended join point models, and other novel interactions between base and aspect code. Such extensions must be developed and evaluated before they can be used in practice.
However, a suitable language workbench for supporting the development and use of such extensions is currently unavailable. On the one hand, DSL workbenches such as Spoofax provide tools for grammar definition and code transformation, but lack support for declaring weaving semantics. On the other hand, AOP compilers, such as AspectJ's ajc and the AspectBench Compiler (abc), do not allow one to extend the weaving semantics on top of a common version of AspectJ, or to define the weaving semantics needed for different extensions to work together simultaneously. Meanwhile, the Awesome aspect composition framework does allow one to define the weaving semantics required for different extensions to work together, but it does not provide the needed front-end capabilities of grammar definition, code transformation, and IDE support.
In this research we designed a modular workbench, composed of Awesome and Spoofax, for developing extensions to AspectJ. Spoofax and Awesome are complementary tools, but simply using them side by side as separate tools would not provide all the capabilities we seek: aspects written in extensions would follow a different compilation path, so the same development environment could not be used, and it would be difficult to produce cross-reference views for such aspects, or to debug them, because Awesome would not parse their original source code.
The novelty of the design lies in the composition approach. Awesome was redesigned to compile and weave code written in various extensions, with parsing and transformation performed internally by Spoofax tools. With the new design, code written in various extensions is compiled and woven just as AspectJ aspects are, and since Awesome receives the original source code, it can produce the information needed by tools that work with AspectJ, such as AJDT and debuggers. Furthermore, our workbench leverages the capabilities of Awesome as an advanced aspect composition framework and of Spoofax as an advanced DSL workbench.
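The weaving idea at the heart of this pipeline can be sketched, very loosely, outside of AspectJ. The following Python toy is purely illustrative (the names `weave`, `logging_advice`, and `transfer` are invented; this is not the API of Awesome, Spoofax, or ajc); it only shows advice being woven around a call join point:

```python
# Conceptual sketch of AOP weaving: advice wraps a base-code join point.
import functools

def weave(advice):
    """Wrap a function so that `advice` runs around each call join point."""
    def decorator(fn):
        @functools.wraps(fn)
        def woven(*args, **kwargs):
            # the advice receives the original computation as `proceed`
            return advice(fn, *args, **kwargs)
        return woven
    return decorator

def logging_advice(proceed, *args, **kwargs):
    print(f"before {proceed.__name__}")
    result = proceed(*args, **kwargs)  # proceed with the base computation
    print(f"after {proceed.__name__}")
    return result

@weave(logging_advice)
def transfer(amount):
    return amount * 2

transfer(21)  # the advice runs around the call; the result is unchanged
```

A real weaver operates on bytecode or an intermediate representation rather than wrapping functions, but the before/proceed/after shape of advice is the same.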
For validation, we implemented in the workbench three advanced extensions to AspectJ proposed by others, namely COOL, explicit join points (EJP), and closure join points (CJP). Not only were we able to implement these extensions with reasonable effort and fully comply with their specifications, but our implementations also exhibit advanced features that were specified in the literature but not implemented in the original prototypes. To demonstrate that tools which work with AspectJ keep working with extensions under our workbench, we created an Eclipse project containing COOL coordinators and successfully produced a cross-reference view for them using AJDT.
The modular design makes the workbench maintainable. A new version of Awesome that works with the latest version of AspectJ can be created by refactoring the latest version of ajc. In addition, coupling between Awesome and Spoofax is low: Spoofax can be replaced with any other DSL workbench that provides grammar definition and code transformation capabilities. We chose Spoofax for its advanced transformation capabilities and because SDF grammars for Java and AspectJ already exist.
Leonardo Passos, University of Waterloo
"Uncovering the Practice of Feature Scattering"
Feature scattering is often said to be bad practice in software development. A scattered feature is not encapsulated in a module; rather, it intertwines with other implementation parts. Consequently, maintaining a scattered feature hinders parallel development, as coding becomes interdependent. Ripple effects also occur, increasing maintenance effort. Moreover, the mix of features obfuscates code, affecting readability and understandability.
Despite these drawbacks, feature scattering is common. Examples include cases where a system needs to be extended in unforeseen ways, where developers need to overcome the dominant decomposition imposed by a programming language, or simply where developers want to avoid the upfront investment of an alternative modular solution (e.g., a design pattern). In such settings, modularity is not always desirable, nor always possible; rather, scattering becomes the primary solution. If not used with care, though, feature scattering may hinder a system's long-term evolution. Thus, feature scattering must be carefully managed, yet it is currently unknown how to accomplish such management.
Interestingly, different large and long-lived software systems have shown that it is possible to continuously evolve in the face of feature scattering. Examples of such systems span different functional domains, including text editors, database management systems, and operating systems. I argue that studying the evolution of such systems is a requirement towards building a general theory to successfully manage feature scattering. Such a theory could serve as a guide to practitioners when monitoring implementation decay, assessing overall maintainability, identifying scattering patterns, balancing the tradeoffs of an alternative design or implementation technique, or setting practical scattering thresholds or evolution parameters.
To contribute towards building a general scattering theory, I set out to investigate feature scattering in one of the largest and longest-lived software systems: the Linux kernel. The sheer size of the kernel (over 13,000 features and 10 million SLOC) and its long evolution history (over 20 years) make it a unique case study, providing insights that are unlikely to be found in any other system. Moreover, no study has yet investigated scattering in the context of an evolving software system.
To understand how feature scattering evolves in the Linux kernel, I initially set out to perform an exploratory analysis of how developers manage feature scattering over time. For scoping reasons, the analysis has been limited to device-driver features (drivers, for short), covering eight years of kernel evolution. Drivers are the most common kind of feature in the kernel, and their evolution shows that: (1) most drivers are not scattered (82% on average), suggesting that the Linux kernel architecture, along with the existing C language constructs, addresses most of the kernel's modularity needs; (2) while most scattered drivers are scattered only across the driver subsystem, a large percentage is not (38% on average); that percentage, however, is never higher than 43% (stable in the last third of releases), suggesting an upper bound; (3) three quarters of scattered features have moderate scattering degrees, spreading across four (median) to eight code locations. Although the median is constant across the observed kernel releases, the distribution is heavily skewed: there are outlier features that are highly scattered, spreading across as many as 377 code locations. The distribution's right skewness suggests heavy-tailed behavior, so the mean may not be representative of how many code locations a feature typically spreads across. The analysis of driver-feature scattering thus provides preliminary evidence that the mean is not a reliable statistic for monitoring and managing feature scattering.
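The statistical point in (3) is easy to reproduce on made-up data. The numbers below are hypothetical, chosen only to mimic the right-skewed shape described above (they are not taken from the kernel study):

```python
# Why the mean misleads on heavy-tailed scattering data: a few highly
# scattered outlier features drag the mean far above the typical feature,
# while the median stays representative.
from statistics import mean, median

# hypothetical scattering degrees (code locations per scattered feature)
scattering_degrees = [4, 4, 5, 5, 6, 7, 8, 8, 9, 120, 377]

print(median(scattering_degrees))  # 7: robust to the outliers
print(mean(scattering_degrees))    # ~50.3: pulled up by two outliers
```

Here the mean is roughly seven times the median, even though only two of eleven features are outliers, which is exactly why the abstract argues against using the mean as a monitoring statistic.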
Contributing factors appear to affect the observed scattering. Specifically, non-CPU-discoverable drivers are more likely (with statistical significance) to be scattered outside the driver subsystem, suggesting that CPU-discoverable drivers are easier to modularize. Moreover, most outlier features are narrow in purpose, relating to specific system-on-chip devices (e.g., a hardware bus).
The second part of the research aims to confirm and extend the findings of the exploratory analysis. In particular, kernel developers will be interviewed to confirm whether the observed evolution parameters are consciously maintained during evolution, or whether they stem from other practices (which will then be analyzed). Moreover, developers will be asked about the difficulties in maintaining scattered features, and how they overcome them (whenever possible).
The evolution parameters and practices that stem from kernel development are not meant to be universal; they only provide evidence that scattering scales to the limits set by kernel developers, according to a set of principles and practices accepted by the kernel community (consciously or not). Nonetheless, my research serves as a starting point for further research, which shall contrast my findings with the feature scattering found in systems other than Linux. Once enough empirical studies confirm or refute my findings, a body of knowledge is likely to emerge, and a scattering theory shall be born.
Bahram Zarrin, Technical University of Denmark
"Towards Separation of Concerns in Scientific Workflows"
Scientific workflows are an active research area, spurred by a shift towards data-intensive and computational methods in the natural sciences and the resulting need for tools that simplify and systematize repeated computational tasks. In these workflows, data is usually streamed through a network of processes that run continuously, consuming inputs and producing outputs as they run. The input-output relationships of the activities constitute the dataflow.
A scientific workflow can be seen as a computational experiment whose outcomes may confirm or invalidate a scientific hypothesis, or serve similar experimental goals. It is therefore important to make it easy for scientists to develop workflows in a way that is compatible with, and supportive of, their actual research processes. In particular, scientific projects are usually exploratory in nature, and the specific analyses of a project are hard to predict a priori. To this end, workflows must be easy to modify and to evaluate from different viewpoints. Most of the computation models currently used for scientific workflows (e.g., Kahn process networks, Flow-Based Programming) are data-driven: they emphasize modeling the inputs and outputs of a system and do not distinguish between the outputs and the attributes of the system. This can increase the complexity and reduce the reusability of workflows.
The inputs and outputs of a system constitute its structural part, while its attributes comprise the qualities and properties of the system. A set of attributes defines a qualitative indicator of a system from a certain viewpoint. Separating these concerns (outputs and attributes) therefore allows the same workflow to be evaluated from different viewpoints, and helps to cleanly modularize and encapsulate the different qualitative aspects and the structural aspect of scientific workflows.
In this work, we first extend the Discrete Event System Specification (DEVS), a modular and hierarchical formalism for modeling and analyzing discrete-event systems, to support system attributes, and we use this extended formalism as the computation model for scientific workflows. We then propose a modeling approach, based on the concept of meta-modeling as defined in the Eclipse Modeling Framework (EMF) and on the extended DEVS formalism, to model domain-specific scientific workflows. We evaluate our approach by defining workflows that model waste scenarios within the waste management domain.
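The separation the abstract argues for can be sketched in a few lines. The class below is a much-simplified, hypothetical stand-in for a workflow step (the names and rates are invented, not part of the authors' extended DEVS formalism): structural outputs flow downstream, while attributes sit on a side channel that viewpoints can read independently.

```python
# Sketch: a workflow step whose structural outputs (the dataflow) are kept
# separate from its attributes (qualitative properties), so the same
# workflow can be evaluated under different viewpoints (e.g., cost, CO2).
class SortingStep:
    def __init__(self):
        # attributes: qualitative indicators, not part of the dataflow
        self.attributes = {"cost": 0.0, "co2": 0.0}

    def step(self, waste_kg):
        # the computation updates attributes as a side channel
        self.attributes["cost"] += 0.5 * waste_kg
        self.attributes["co2"] += 0.125 * waste_kg
        # output function: only these values stream to downstream steps
        return {"recyclable_kg": 0.25 * waste_kg,
                "residual_kg": 0.75 * waste_kg}

sorting = SortingStep()
out = sorting.step(1000.0)
# downstream processes consume `out`; a cost viewpoint reads the attributes
print(out["recyclable_kg"], sorting.attributes["cost"])  # 250.0 500.0
```

In a conventional data-driven model, cost and CO2 would have to be wired into the dataflow as extra outputs, entangling every downstream step with every viewpoint; keeping them as attributes leaves the workflow structure untouched when a new viewpoint is added.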
Thomas Degueule, IRISA - INRIA
"Towards Language Interfaces for DSLs Integration"
Developing software-intensive systems involves many stakeholders who bring their expertise on specific concerns of the developed system. Model-Driven Engineering (MDE) proposes to address each concern separately with a dedicated Domain-Specific (possibly modeling) Language (DSL) closely tied to the needs of each stakeholder. With DSLs, models are expressed in terms of problem-level abstractions. Associated tools are then used to semi-automatically transform the models into concrete artifacts. However, the definition of a DSL and its tooling (e.g., checkers, editors, generators, model transformations) still requires significant development efforts for, by definition, a limited audience.
DSLs evolve as the concepts in a domain, and experts' understanding of the domain, evolve. A simple example is the addition, refinement, or removal of features from a DSL, possibly with the intent of ensuring compatibility between subsequent versions. Additionally, current industrial practice has led to widespread use of small, independently developed DSLs, raising challenges related to sharing languages and their corresponding tools. For example, the core concepts of an action language can be shared by all DSLs that encompass the expression of actions. Finally, while more and more DSLs are developed in various domains, recurrent paradigms are observed (e.g., state-transition systems, classifiers), each with its own syntactic and semantic variation points reflecting domain specificities (e.g., a family of finite-state machine languages).
Given the DSL development costs, redefining from scratch a new ecosystem of tools for each variant of a DSL is not scalable. Instead, one would like to leverage the commonalities of these languages to enable reuse of existing tools. An underlying challenge is the modular definition of languages, i.e., the possibility to define either self-contained or incomplete language components (in terms of syntax and semantics) that can be recomposed afterwards into new DSLs. To support modularity, DSL designers should be able to define proper provided and required interfaces for each language component, together with composition operators.
To improve modularity and abstraction capabilities in software language engineering and support the aforementioned scenarios, we advocate the definition of explicit language interfaces on top of language implementations. Language interfaces make it possible to abstract away some of the intrinsic complexity of language implementations by exposing meaningful information about an aspect of a language (e.g., its syntactic constructs), for a specific purpose (e.g., composition, reuse, or coordination), in an appropriate formalism. In this regard, language interfaces can be thought of as a reasoning layer on top of language implementations. The definition of language interfaces relies on proper formalisms for expressing different kinds of interfaces and the binding relations between language implementations and interfaces. Using language interfaces, one can vary or evolve the implementation of a language while preserving the tools and analyses defined over its interface. Language interfaces also facilitate the modular definition of languages by enabling the description of the required and provided interfaces of a language (or language component). Syntactic or semantic composition operators can then be defined upon these interfaces. Language interfaces may be crafted manually or automatically inferred from an implementation.
Model types are an illustration of such interfaces. A model type is an interface on the abstract syntax of a language (defined by a metamodel). Models are linked to model types by a typing relation. Most importantly, model types are linked to one another by subtyping relations, providing model polymorphism, i.e., the ability to manipulate a model through different interfaces. Model polymorphism enables the definition of generic tools that can be applied to any model matching the interface on which they are defined, regardless of the concrete implementation of its language. Model types can also be used to filter the information exposed from the abstract syntax of a language; in doing so, they can define language viewpoints by extracting the information appropriate to one specific development task of a stakeholder. Model types are supported by a model-oriented type system that leverages family polymorphism and structural typing to abstract the conformance relation between models and metamodels into a typing relation between models and model types.
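The flavor of model polymorphism via structural typing can be approximated in plain Python. The sketch below is hypothetical (all names are invented; this is not Melange's type system): a generic tool is written against an interface, and any model that structurally matches the interface can be passed to it, regardless of its concrete class.

```python
# Structural typing as an analogy for model types: the generic tool
# count_states depends only on the StateMachineType interface, not on
# any concrete language implementation.
from typing import List, Protocol, runtime_checkable

@runtime_checkable
class StateMachineType(Protocol):
    """A 'model type': the interface a state-machine model must expose."""
    def states(self) -> List[str]: ...

class MooreMachine:
    # no inheritance from StateMachineType: matching is structural
    def states(self) -> List[str]:
        return ["idle", "running"]

class MealyMachine:
    def states(self) -> List[str]:
        return ["s0", "s1", "s2"]

def count_states(model: StateMachineType) -> int:
    # generic tool: oblivious to the concrete FSM variant
    return len(model.states())

print(count_states(MooreMachine()), count_states(MealyMachine()))
```

The analogy is loose (model types relate metamodels, not single classes, and carry subtyping rather than mere duck typing), but it conveys why a tool defined over an interface applies across a family of language variants.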
We incorporated these concepts into Melange, a new language for DSL designers and users. Melange is a language-based, model-oriented programming language in which DSL designers can manipulate language definitions with high-level operators (e.g., inheritance, composition, slicing) and express their relations through the definition of metamodels, language interfaces, and transformations. Melange provides DSL users with an action language in which models are first-class typed citizens, and embeds a model-oriented type system that natively provides model polymorphism through model typing. We applied Melange to two industrial use cases to maximize the reuse of DSL ecosystems: managing syntactic and semantic variation points in a family of FSM languages, and providing an executable extension of Capella, a large-scale systems-engineering modeling language.