Introduction
This page is a clearinghouse for discussion of the ongoing evolution of the Puppet language. Where necessary, it describes how a language change would be implemented all the way through to the client-side RAL, but it generally focuses on the parser.
Accepted Changes
Language changes that have been accepted but not yet implemented.
Proposed Changes
Language changes with specific implementation proposals.
Generated Language Reference
It has been difficult to maintain a language reference, but the other reference documentation has been generated successfully from embedded documentation. We should modify the subclasses of Puppet::Parser::AST to support similarly embedded documentation, and then enhance puppetdoc to generate a language reference from it.
This is a pretty straightforward change. Here's the basic process:
- Extend the AST base class to keep track of its subclasses
- Include the Puppet::Util::Docs module in the AST base class
- Add class-specific documentation to each AST subclass where it's appropriate (not all subclasses will need it)
- Extend puppetdoc to generate reference documentation from each subclass
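The steps above might be sketched in Ruby roughly as follows. This is only an illustration under stated assumptions: DocSketch is a hypothetical stand-in for Puppet::Util::Docs (whose real interface may differ), and ASTBase, Selector, Internal, known_subclasses, and language_reference are names made up for the example.

```ruby
# Hypothetical stand-in for Puppet::Util::Docs; the real module's
# interface may differ.
module DocSketch
  # Set the documentation when given a string, otherwise return it.
  def doc(text = nil)
    @doc = text if text
    @doc
  end
end

# Illustrative base class, not the real Puppet::Parser::AST.
class ASTBase
  extend DocSketch

  @known_subclasses = []

  class << self
    attr_reader :known_subclasses

    # Ruby calls this hook whenever a class inherits from ASTBase
    # (directly or indirectly), so the base class can keep a registry.
    def inherited(klass)
      super
      ASTBase.known_subclasses << klass
    end
  end
end

class Selector < ASTBase
  doc "Chooses a value based on the value of a variable."
end

class Internal < ASTBase
  # No documentation: not every subclass needs it.
end

# Roughly what puppetdoc might do: emit one entry per documented subclass.
def language_reference
  documented = ASTBase.known_subclasses.select(&:doc)
  documented.map { |klass| "#{klass.name}\n#{klass.doc}\n" }.join("\n")
end

puts language_reference
```

Undocumented subclasses such as Internal are simply skipped, which matches the note that not all subclasses will need documentation.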
Desired Features
Features that are desired in the language but have no proposed syntax yet.
Per-Provider Parameters
This feature comes out of a thread on the dev list. The basic idea is that many providers (especially package providers) need specific parameters to function properly. Adding all of those parameters to the package type does not scale in the long term, and it restricts users' ability to extend Puppet, because they have to modify the base type to support the parameters a provider needs.
This functionality almost certainly entails modifications to both the language and the library, and it might also require modifications to the transportable format and to the database used to store resources.
Luke has mentioned a possible syntax:
package { mypkg:
  provider => gentoo { use => ldap },
  ensure => installed
}
This syntax is appealing because it looks similar to what already exists, but it adds a good deal of language complexity. On reflection, it also breaks down when more than one provider has to be specified:
package { mypkg:
  ensure => installed,
  provider => $operatingsystem ? {
    gentoo { use => ldap },
    solaris { adminfile => "..." }
  }
}
It seems like a much better option would be to implement hashes in the language (a long-requested feature), and then add a special parameter for passing these hashes on to the providers:
package { mypkg:
  ensure => installed,
  extra => { # it's a nice parameter name because it's short, but it's not very specific
    gentoo => { use => ldap }, # These are provider names, not platform names
    sun => { adminfile => "..." }
  }
}
Unfortunately, this will interact very poorly with specifying defaults:
Package { extra => { sun => { adminfile => "..." } } }
package { mypkg:
  ensure => installed,
  extra => { # this overrides the default, so you get no solaris extra info
    gentoo => { use => ldap }
  }
}
I think the hash-as-argument idea is the best, but we would need some way to add to the defaults rather than replace them entirely.
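One way to "add to the defaults" is a recursive merge of the default hash and the resource's own hash. The Ruby sketch below is only an illustration of that idea on plain hashes; deep_merge is a helper defined here, not an existing Puppet function, and nothing in it says how the language itself would expose such a merge.

```ruby
# Recursively merge `overrides` into `defaults`. Nested hashes are merged
# key by key; any other value in `overrides` replaces the default.
def deep_merge(defaults, overrides)
  defaults.merge(overrides) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# The Package default and the mypkg resource from the example above.
defaults = { "extra" => { "sun" => { "adminfile" => "..." } } }
resource = {
  "ensure" => "installed",
  "extra"  => { "gentoo" => { "use" => "ldap" } }
}

# A plain merge replaces the whole "extra" hash, losing the sun entry.
p defaults.merge(resource)["extra"].keys        # => ["gentoo"]

# A deep merge keeps both provider entries.
p deep_merge(defaults, resource)["extra"].keys.sort  # => ["gentoo", "sun"]
```

Whether deep merging should be the default behavior or something the user asks for explicitly is left open here.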
Global Configuration Constants
There is often a need for what amounts to global configuration constants: values that are like variables, except that they can never be overridden and are accessible throughout the entire configuration. All of the facts from Facter should be global constants rather than normal variables, for instance, but there are also top-level configuration variables that should be constant and global. For example, if you have multiple configuration domains, each with its own set of servers, you would want something like the following:
# Yes, there are better ways of doing this; it's just an example
case $location {
  northamerica: {
    $fileserver = "na.domain.com"
    $puppetserver = "puppet.na.domain.com"
  }
  europe: {
    $fileserver = "eu.domain.com"
    $puppetserver = "puppet.eu.domain.com"
  }
}
You'd want the $fileserver and $puppetserver values to be globally available and constant.
This will almost certainly require a special syntax. The best solution is probably to use lower-case names for variables that can be overridden and upper-case names for those that can't, but the current lexer accepts variable names with any capitalization, so it is rather late to make that switch outright. Maybe we should introduce warnings for now and switch later.
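As a rough sketch of how the capitalization convention and a warning period could behave, consider the following Ruby model. Everything in it is an assumption made for illustration: ScopeSketch, ConstantReassigned, and the strict flag are hypothetical names, and the reading of "warnings for now" as "warn on reassigning an upper-case name, but still allow it" is only one possible interpretation.

```ruby
class ConstantReassigned < StandardError; end

# Hypothetical variable store; not Puppet's real scope implementation.
class ScopeSketch
  def initialize(strict: true)
    @strict = strict
    @values = {}
  end

  # Assumed convention: a leading upper-case letter marks a constant.
  def constant?(name)
    name.match?(/\A[A-Z]/)
  end

  def set(name, value)
    if constant?(name) && @values.key?(name)
      message = "constant #{name} is already set"
      raise ConstantReassigned, message if @strict
      warn "warning: #{message}" # transitional mode: warn, then override
    end
    @values[name] = value
  end

  def get(name)
    @values[name]
  end
end

scope = ScopeSketch.new
scope.set("Fileserver", "na.domain.com")
scope.set("tmpdir", "/tmp")
scope.set("tmpdir", "/var/tmp") # ordinary variables may still change
begin
  scope.set("Fileserver", "eu.domain.com")
rescue ConstantReassigned => e
  puts e.message
end
```

Constructing the store with strict: false models the transitional period, in which reassignment only produces a warning.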
Tags as Global Booleans
Puppet already has boolean tags used in many places throughout the framework, but they're not very formalized and they are not terribly easy to use. Tags should be formally characterized as global booleans that, once set, cannot be unset, and whose values should be consistently available throughout the language.
This would definitely require the language to become significantly more declarative, or it would require that there be a special configuration section for setting these booleans.
The current parser is not at all prepared to parse code like the following:
# if both $webserver and $mailserver are booleans
if $webserver { $mailserver = true }
if $mailserver { $webserver = true }
A real declarative parser would probably throw a warning here, but Puppet's parser is not that smart, so right now it has to compile the entire configuration to get all of the tags set, by which point it is probably too late:
class base {
  if tagged(one) { include two }
  else { include three }
  include one
}
The effect of this code varies dramatically depending on the order in which it is evaluated. This conflict is pretty straightforward to see, but it would be much harder to spot if the test and the inclusion were in different parts of the class hierarchy.
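The order dependence can be made concrete with a tiny simulation. The Ruby sketch below is not Puppet's evaluator; evaluate_base is a made-up function that runs the two statements of class base (the tagged test and the include of one) in a given order and returns the classes that end up included.

```ruby
# Evaluate the statements of class "base" in the given order and return the
# classes that end up included. Purely illustrative; not Puppet's evaluator.
def evaluate_base(order)
  tags = []
  statements = {
    # if tagged(one) { include two } else { include three }
    test: -> { tags << (tags.include?("one") ? "two" : "three") },
    # include one
    include: -> { tags << "one" }
  }
  order.each { |name| statements[name].call }
  tags
end

p evaluate_base([:test, :include]) # => ["three", "one"]
p evaluate_base([:include, :test]) # => ["one", "two"]
```

The same two statements yield class three in one order and class two in the other, which is the conflict described above.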
The more I think about it, the more it seems to make sense to turn the node construct into a mini-language for setting these booleans. Class and definition names would still be used as tags, but the booleans would matter more. This would be disappointingly similar to how cfengine has a separate groups section, but it would do a good job of further separating the characterization of a node (that is, what features it should have) from the configuration (that is, what resources those features expand to). ISconf had this separation and it was even more distinct than cfengine's, and in some ways it was very nice.
It also makes integration with external node stores much easier to think about.
With this mini-language, the node code would all be evaluated first, and all of the booleans would be global and constant throughout the rest of the configuration. It would make Puppet a context-sensitive language, though (I think), in that different statements would be valid in node constructs than in class constructs, which are otherwise equivalent. Maybe a new construct should be introduced just for this purpose, rather than changing how nodes work. This would also help the feature behave the same way whether a configuration is applied with puppet or with puppetd.
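A minimal sketch of that two-phase evaluation, assuming the booleans live in a simple hash: the node code runs first and may set flags, the hash is then frozen, and only afterwards does the class code run and read them. The compile function and the lambdas are hypothetical names for this illustration, not part of Puppet.

```ruby
# Hypothetical two-phase compile. Phase 1 lets node code set booleans;
# the flags are then frozen, so phase 2 (class code) can only read them.
def compile(node_code, class_code)
  flags = {}
  node_code.call(flags)  # phase 1: characterize the node
  flags.freeze           # booleans are now global and constant
  class_code.call(flags) # phase 2: expand features into classes
end

node_code  = ->(flags) { flags["webserver"] = true }
class_code = ->(flags) { flags["webserver"] ? ["apache"] : [] }

p compile(node_code, class_code) # => ["apache"]
```

Because the hash is frozen before the class code runs, any attempt to set a boolean from within a class raises an error instead of silently changing the outcome of an earlier test.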
Relationships Should Be a Language Feature
Relationships are currently handled entirely within attributes of resources, which doesn't make much sense because relationships are about connecting resources, not configuring them. They're more of a layer above the resources, rather than an aspect of them, and it would be nice if the language treated them this way. This would also allow relationship validation during compilation rather than having to wait until we get to the client, but mostly the goal would be to explicitly treat the relationships as a link between resources rather than as a trait of one resource or another.
One big problem with the current state of relationships in Puppet is that there is no facility for specifying relationships between classes, because classes don't have attributes. A special relationship syntax would allow us to specify that one class requires another, which is otherwise currently impossible. This could possibly even be used to determine compile order, which is otherwise unchangeable.
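If class-level relationships existed, deriving a compile order from them would amount to a topological sort. The Ruby sketch below uses the standard library's TSort module for that; ClassGraph, add_require, and the class names are invented for the example and say nothing about what the eventual syntax would look like.

```ruby
require "tsort"

# Hypothetical store of class-level relationships; the names are invented.
class ClassGraph
  include TSort

  def initialize
    @requires = Hash.new { |hash, key| hash[key] = [] }
  end

  # Record that class `name` requires class `dependency`.
  def add_require(name, dependency)
    @requires[name] << dependency
    @requires[dependency] # register the dependency as a node as well
    self
  end

  # TSort needs to know how to walk all nodes...
  def tsort_each_node(&block)
    @requires.each_key(&block)
  end

  # ...and how to walk the dependencies of one node.
  def tsort_each_child(node, &block)
    @requires.fetch(node, []).each(&block)
  end
end

graph = ClassGraph.new
graph.add_require("webserver", "base")
graph.add_require("wiki", "webserver")

# Dependencies come before the classes that require them.
p graph.tsort
```

A circular requirement makes tsort raise TSort::Cyclic, which is the kind of validation that could then happen during compilation rather than on the client.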
There is also some confusion about the relationship between different resources within a class, or between a subclass and its parent classes. Currently, there is no implied or explicit relationship between resources in a class, and no way to specify one other than by setting individual relationships on each resource. A special syntax might allow us to configure classes so that they handle their relationships automatically.
David Lutterkort recently discussed the many types of relationships resources could have:
That's a very common issue in OO modeling, because there's not one, but two ways in which two objects can relate to each other: composition and aggregation. When A is a component of B, then A and B have the same lifecycle, and deleting B means that you also have to delete A. When A is simply an aggregated part of B, then deleting B won't change A at all. (I hope Wikipedia has good examples on this, because it's a subtle and very important distinction)
Having a special syntax for specifying relationships would give us the flexibility to talk about these more subtle distinctions, hopefully without getting insanely confused.
I have no idea what this special syntax should look like, though.
Parameterized classes
Let's say you want to create a Puppet class to model an NTP client that synchronizes its clock to some given server. To configure your ntp daemon you use a template with a reference to a $ntp_server variable, so that you can point to different time servers depending on the node you're applying this class to:
class ntp {
  package { ntp : ensure => latest }
  file { "/etc/ntp.conf" : content => template( "ntp.conf" ) }
  service { ntpd :
    ensure => running,
    subscribe => [ Package[ntp], File["/etc/ntp.conf"] ]
  }
}
How do you define that variable? In the current state of puppet this can be done in several ways (have I missed any?):
You may use a potentially huge selector or case statement at some defined place in your manifests to define the desired variable, based on some fact:
$ntp_server = $domain ? {
  "firstdomain.tld" => "es.pool.ntp.org",
  "seconddomain.tld" => "uk.pool.ntp.org",
  default => "es.pool.ntp.org"
}
node machine {
  include ntp
}
You may define a different class for every possible external time server, so you can apply that class to any node:
class ntp_es inherits ntp {
  $ntp_server = "es.pool.ntp.org"
}
class ntp_uk inherits ntp {
  $ntp_server = "uk.pool.ntp.org"
}
node machine {
  include ntp_es
}
Finally, you may set the variable in every node that includes the class:
node machine {
  $ntp_server = "uk.pool.ntp.org"
  include ntp
}
I think this problem could be better solved if Puppet had support for parameterized classes (the concept is similar to template classes in C++ or generics in Java). A parameterized ntp class could look like this (using a syntax similar to templates and generics):
class ntp < $ntp_server = "uk.pool.ntp.org" > {
...
}
And then, a node could simply include this class using the following:
node machine {
  include ntp < $ntp_server => "es.pool.ntp.org" >
}
In general, a parameterized class would look like this:
class someClass < $parameter1 = default, $parameter2 = default,... > {
  [class definition using parameterX in its body]
}
A parameterized class could be used anywhere a normal class can, as long as all of its parameters have an assigned value, either explicitly or implicitly through a default. Use of partially "instantiated" classes could be allowed in some cases: for example, a class inheriting from a partially instantiated class could define a new parameterized class that inherits its remaining parameters from the parent class (maybe this is too complicated?).
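The resolution rule described here (explicit values win, defaults fill the gaps, and every parameter must end up with a value) can be modelled in a few lines of Ruby. This is a sketch of the proposal's semantics only; ParameterizedClass, REQUIRED, instantiate, and specialize are hypothetical names, not an existing Puppet API.

```ruby
# Hypothetical model of a parameterized class; not an existing Puppet API.
class ParameterizedClass
  REQUIRED = Object.new.freeze # marker for "no default given"

  def initialize(name, parameters)
    @name = name
    @parameters = parameters
  end

  # Combine explicit arguments with defaults. Raises if an argument is
  # unknown or if a parameter has neither an argument nor a default.
  def instantiate(arguments = {})
    unknown = arguments.keys - @parameters.keys
    unless unknown.empty?
      raise ArgumentError, "unknown parameter(s) for #{@name}: #{unknown.join(', ')}"
    end

    values = @parameters.merge(arguments)
    missing = values.select { |_, value| value.equal?(REQUIRED) }.keys
    unless missing.empty?
      raise ArgumentError, "#{@name} is missing: #{missing.join(', ')}"
    end
    values
  end

  # A partially instantiated class: a new parameterized class in which
  # some parameters have been given new defaults.
  def specialize(name, arguments)
    ParameterizedClass.new(name, @parameters.merge(arguments))
  end
end

ntp = ParameterizedClass.new("ntp", { "ntp_server" => "uk.pool.ntp.org" })
p ntp.instantiate                                        # uses the default
p ntp.instantiate({ "ntp_server" => "es.pool.ntp.org" }) # explicit value

ntp_es = ntp.specialize("ntp_es", { "ntp_server" => "es.pool.ntp.org" })
p ntp_es.instantiate
```

The specialize method is one possible reading of the partially instantiated case; it treats the child as a new parameterized class whose defaults differ from the parent's.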