NFJS Spring 2009 – Day Three
Posted by Jeffrey Hulten in Blog Posts on April 16th, 2009
NOTE: I delayed posting these entries to clean up my notes and add some useful links.
GRUMPY NOTE: I lost a bunch of notes so this is based on my faulty memory and some slides.
I got in a little late today after grabbing coffee and headed to Michael Nygard’s presentation “Architect for Scale”. I love the smell of capacity math in the morning.
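Amdahl’s Law
$$S(N) = \frac{N}{1 + \sigma (N - 1)}$$
Speedup Ratio
$$S(N) = \frac{T_1}{T_N}$$
Universal Scalability Law
$$C(N) = \frac{N}{1 + \sigma (N - 1) + \kappa N (N - 1)}$$
where
$N$ = number of processors
$T_1$, $T_N$ = run time on one processor and on $N$ processors
$\sigma$ = contention
$\kappa$ = coherency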
Michael mentioned a book, Guerrilla Capacity Planning by Neil Gunther. It covers a lot of the mathematics of scalability and capacity planning.
There are only two ways to increase scalability: decrease contention or decrease coherency. Why is “improve performance” not on the list? Increasing performance increases capacity. Scalability is the measure of how added resources affect added capacity. Increasing performance can reduce your need for scalability, but it does not improve scalability. An interesting side note was that increasing performance generally means making the serial portion ($\sigma$) a larger share of the total time. This means that as you spend more time on performance, you actually create a situation where you will reach maximum capacity with fewer processing resources.
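To make the capacity math concrete, here is a minimal sketch of the Universal Scalability Law in Java. It assumes the usual form C(N) = N / (1 + σ(N − 1) + κN(N − 1)), with σ as contention and κ as coherency. The class name and the values of σ and κ are made up for illustration and are not from the talk.

```java
// Sketch only: sigma and kappa below are invented, not measured values.
final class UslSketch {

    /** Relative capacity C(N) = N / (1 + sigma*(N-1) + kappa*N*(N-1)). */
    static double capacity(int n, double sigma, double kappa) {
        return n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1));
    }

    /** Approximate node count where capacity peaks: sqrt((1 - sigma) / kappa). */
    static double peakNodes(double sigma, double kappa) {
        return Math.sqrt((1.0 - sigma) / kappa);
    }

    public static void main(String[] args) {
        double sigma = 0.05;   // assumed contention
        double kappa = 0.001;  // assumed coherency
        for (int n : new int[] {1, 2, 4, 8, 16, 32, 64}) {
            System.out.printf("N=%d C(N)=%.2f%n", n, capacity(n, sigma, kappa));
        }
        System.out.printf("Capacity peaks near N=%.1f%n", peakNodes(sigma, kappa));
    }
}
```

With a non-zero κ the curve eventually bends downward, so adding nodes past the peak reduces capacity. With κ set to zero the formula reduces to Amdahl’s Law.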
Brewer’s Conjecture
You have heard the old quote “Faster, better, cheaper… Pick two”? Well, when you are talking about systems architecture you can choose at most two of the following:
- Consistency
- Availability
- Partition-Tolerance
Let’s look at Michael’s definitions:
- Consistency: There exists a total ordering on all operations, and all nodes in the system agree on that ordering at every point in time.
- Availability: Every request received by a non-failing node must result in a response.
- Partition-Tolerance: The network may lose arbitrarily many messages from any subset of nodes to any other subset of nodes.
You want your system to be consistent and available? Partitioning is not allowed. How are you going to prevent partitioning? Do you expect servers, switches and network cards to never fail? This is unrealistic.
You want your system to be consistent and partitionable? Consistency can only be guaranteed if the service is unavailable during partitions. Otherwise you end up with ‘split-brain’.
You want your system to be available and partitionable? We maintain availability during partitions by allowing different subsets to report different histories. This means that agreement or synchronization protocols are forbidden.
An important note: when you use and rely on ACID compliance of a relational database, you inherently select Consistency.
“If you can’t split it, you can’t scale it.” — Randy Shoup, eBay
All partitioning strategies assume no cross-cluster dependencies on shared data. Shared writable data requires serialized access, which raises contention ($\sigma$).
The database theory example for ACID compliance of a bank transaction is flawed as soon as User A and User B are with different banks. Instead of ‘always consistent’ we need to think about ‘eventually consistent’.
I decided to hang around for Michael’s next talk, ‘The 90-Minute Startup’. He talked about Amazon EC2. I lost all my notes on this talk, but seeing the cloud in action was pretty cool.
After lunch we had the expert panel discussion where Ken Sipe, Ted Neward, Matthew McCullough, Brian Sam-Bodden, and Michael Nygard talked about issues like the potential future of software development regulation, the potential impact of the IBM-Sun acquisition, and parallels between software engineering and the medical, legal, and construction professions.
After the panel I attended Ken Sipe’s session on Java Memory, Performance, and Garbage Collection. I learned a bunch about the theory of garbage collection on the JVM. I picked up a few nice tips. To speed up your JVM startup time, set the initial and maximum permanent generation sizes (-XX:PermSize and -XX:MaxPermSize) based on what you see while your application is running. This way the JVM is not allocating space, finding it insufficient, and reallocating over and over as you start up.
Some useful equations from the JVM memory management talk:
$$S = \frac{N}{R + 2}$$
$$E = N - 2S$$
where
$E$ = Size of Eden
$N$ = Size of New Space
$R$ = Survivor Ratio
$S$ = Size of Survivor Space
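As a worked example of the sizing arithmetic, here is a small sketch. It assumes the HotSpot convention that the survivor ratio is Eden divided by one survivor space, and that New Space holds Eden plus two survivor spaces. The class and method names are made up for illustration.

```java
// Sketch only: assumes New Space = Eden + 2 survivor spaces,
// and SurvivorRatio = Eden / one survivor space.
final class YoungGenSketch {

    /** Size of one survivor space: S = N / (R + 2). */
    static long survivorSize(long newSize, int survivorRatio) {
        return newSize / (survivorRatio + 2);
    }

    /** Size of Eden: E = N - 2S. */
    static long edenSize(long newSize, int survivorRatio) {
        return newSize - 2 * survivorSize(newSize, survivorRatio);
    }

    public static void main(String[] args) {
        long newSizeMb = 100;   // hypothetical New Space size in MB
        int survivorRatio = 8;  // hypothetical -XX:SurvivorRatio value
        System.out.println("Eden MB: " + edenSize(newSizeMb, survivorRatio));
        System.out.println("Each survivor MB: " + survivorSize(newSizeMb, survivorRatio));
    }
}
```

With a 100 MB New Space and a ratio of 8, each survivor space gets 10 MB and Eden gets the remaining 80 MB.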
jps lets you list Java processes either locally or remotely.
jstatd exposes the process information to jps over the network. A Java security policy file is needed to run it.
I stuck around for Ken’s session as my last of the day, “Debugging the Production JVM”. After an example of VisualVM he moved into BTrace, a dynamic tracing tool that runs on Java 1.6. It allows you to instrument a running JVM and get events when methods are called or other activities take place.
All in all I had a great time and learned a whole bunch.
NFJS Spring 2009 – Day Two
Posted by Jeffrey Hulten in Blog Posts on April 15th, 2009
NOTE: I delayed posting these entries to clean up my notes and add some useful links.
After getting to the hotel about ten minutes beforehand, I started the day with Ted Neward’s talk, ‘The Busy Developers Guide to Java Platform Security’. Instead of a “boring” presentation Ted decided it was story time.
He told the story of deciding that technology was the path to burnout and he would start his own bank, steal from it and then get bailout money. He hires a guard to stand in front of the vault and check the credentials of everyone coming in and out of the vault. He then hires a teller to be the front end of his bank.
TED hires GUARD
GUARD allows all people deposit(*) and withdraw(300)
GUARD checks permissions on BOB depositing $10
BOB succeeds
This is GOOD
GUARD checks permissions on CARLOS (the terrorist) withdrawing $100
CARLOS succeeds
This is BAD
TED hires TELLER
TELLER has deposit(*) and withdraw(300) permission
GUARD only allows people with the right permissions to access the vault
GUARD checks permissions of TELLER (succeed) and then the object that TELLER is acting on behalf of (CARLOS)
CARLOS attempts to withdraw $100, which he does not have permission to do
GUARD kills CARLOS with PINKYOFDEATH
This is GOOD
GUARD checks permissions of TELLER (succeed) and then the object that TELLER is acting on behalf of (BOB)
BOB attempts to deposit $10, which he does not have permission to do
GUARD kills BOB with PINKYOFDEATH
This is BAD
TED tells TELLER to show a privilege flag if it is doing something on behalf of another
TELLER starts checking if someone can deposit or withdraw a given amount, then shows a privileged activity flag (business logic)
In this simplified version of Ted’s roleplaying, the following ‘characters’ represented the following parts of a Java-based application:
- TED = Developer
- GUARD = Access Controller, checks permissions to an access domain
- TELLER = Run Time Library (rt.jar) and business logic
- PINKYOFDEATH = AccessControlException
To turn on the security manager you add the -Djava.security.manager flag to the java command line. It references the jre/lib/security/java.policy file for the JRE.
grant [signedBy "KEY"] [, codeBase "URL"] {
    permission PermissionClass "TARGET", "ACTION";
};
IMPORTANT SAFETY TIP!
To UNION the JRE security policy with your security policy, pass -Djava.security.policy=your.policy when you run the java executable. To REPLACE the JRE standard policy with your security policy, use two equals signs (-Djava.security.policy==your.policy). If the permission check fails for the target, an AccessControlException is thrown.
I was impressed by Ted’s talk and decided to stick around for his Busy Java Developers Guide to Advanced Platform Security. The key topics he covered were:
- security debugging
- custom Permissions
- jar signing for server permissions
- custom Access Controller contexts
Security debugging can be activated with -Djava.security.debug=FLAG where FLAG is one of all, jar, policy, scl, or access.
The jar flag is for testing JAR signing.
The policy flag is good for making sure your policy file is being read and parsed correctly. It shows whether a policy file has a syntactical error in it (a missing semicolon, etc.) or a malformed URL. The security manager attempts to instantiate the permission classes named in the policy, but if the classpath is not yet established it marks the permission class as unresolved and expects it to be provided at runtime. If the permission policy does not exist, the permission is not granted but the policy as a whole is still in play.
The scl flag dumps information when the ClassLoader assigns permissions to classes. It isn’t as useful as policy and access.
The access flag prints all checkPermission results. Its three suboptions are stack, domain, and failure, which show a stack trace during the checkPermission call, the protection domain during the call, and additional information during a failure, respectively:
java -Djava.security.manager -Djava.security.debug=access:stack,domain,failure AppClass
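import java.security.AccessController;

public class Util {
    public static void doSomething() {
        // Always enforced, even when no security manager is installed
        AccessController.checkPermission(new RuntimePermission("insult"));
        System.out.println("You're ugly!");
    }
}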
This checks the RuntimePermission regardless of whether the security manager is enabled. The AccessController class exists NO MATTER WHAT.
import java.security.AccessController;

public class Util {
    public static void doSomething() {
        // Only check when a security manager has been installed
        if (System.getSecurityManager() != null) {
            AccessController.checkPermission(new RuntimePermission("insult"));
        }
        System.out.println("You're ugly!");
    }
}
Now the check is called only if the security manager is enabled. He then covered writing custom permission classes, which is where my typing fell behind. It is pretty simple stuff though. The key seemed to be the implementation of the ‘implies’ method on your permission class. Finally he covered signing jars, which is not particularly useful unless you also include the key in the security policy and get your key signed by a third party such as Verisign.
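Since my notes on custom permissions are thin, here is a minimal sketch of what such a class might look like. It is not Ted’s code. WithdrawPermission and its limit field are hypothetical names borrowed from the bank story, and the only API assumed is java.security.Permission with its four abstract methods.

```java
import java.security.Permission;

/** Hypothetical permission: allows withdrawals up to a limit. */
final class WithdrawPermission extends Permission {
    private final int limit;

    WithdrawPermission(int limit) {
        super("withdraw");
        this.limit = limit;
    }

    /** A grant of withdraw(300) implies any request for 300 or less. */
    @Override
    public boolean implies(Permission other) {
        return other instanceof WithdrawPermission
                && ((WithdrawPermission) other).limit <= limit;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof WithdrawPermission
                && ((WithdrawPermission) obj).limit == limit;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(limit);
    }

    @Override
    public String getActions() {
        return Integer.toString(limit);
    }

    public static void main(String[] args) {
        Permission granted = new WithdrawPermission(300);
        System.out.println(granted.implies(new WithdrawPermission(100)));  // true
        System.out.println(granted.implies(new WithdrawPermission(500)));  // false
    }
}
```

The access check asks whether any granted permission implies the requested one, which is why ‘implies’ carries most of the logic.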
While my brain did not feel as full as last time, my stomach was empty and I was glad for the lunch break. Buttermilk fried chicken, mashed potatoes and apple pie made for a nice midday meal.
After lunch one of my coworkers (and one of my groomsmen from my wedding last year) and I went to “Beginning Drools: Rules Engines in Java”, presented by Brian Sam-Bodden.
The real joy of a rule engine is that you can move to fact-based evaluation of code instead of massive nested IF/THEN structures.
Rules need to be discrete encapsulations of knowledge. For instance, “if it is sunny outside, you need sunscreen”. The firing of a rule can result in the modification, addition or removal of existing facts. Facts are purely assertions about the problem. For instance:
when: there are two people and they share only one parent
then: they are half-siblings
A rule engine is a system that matches facts and data against production rules. It derives or infers conclusions from premises by using rules.
The inference engine is the component of the rule engine that matches facts against the production rules. Inference engines use pattern matching algorithms such as Linear, Rete, Treat, or Leaps. Drools uses the Rete algorithm.
Chaining is reasoning using inference rules. The two main methods are forward chaining and backwards chaining. Forward chaining starts with a base set of facts and infers more facts until a desired goal is reached. This type of recursive processing requires a stop condition to let the engine know when to stop processing. Drools does support a timeout that allows you to determine how long you should be allowed to process the request.
Forward chaining is data-driven. Backwards chaining is goal-driven: it starts with a list of goals you want to achieve and infers the data that will satisfy the goals by working backwards.
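Forward chaining can be sketched in a few lines of plain Java. This is a naive loop, not the Rete algorithm and not the Drools API. The Rule class, the string facts, and the sunscreen rules are all invented for illustration. The stop condition here is a full pass over the rules that adds no new fact.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ForwardChainingSketch {

    /** Hypothetical rule shape: when every premise is a known fact, assert the conclusion. */
    static final class Rule {
        final Set<String> premises;
        final String conclusion;

        Rule(Set<String> premises, String conclusion) {
            this.premises = premises;
            this.conclusion = conclusion;
        }
    }

    /** Fires rules until a full pass adds no new fact (the stop condition). */
    static Set<String> infer(Set<String> initialFacts, List<Rule> rules) {
        Set<String> facts = new LinkedHashSet<>(initialFacts);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule rule : rules) {
                // Set.add returns true only when the conclusion is a new fact
                if (facts.containsAll(rule.premises) && facts.add(rule.conclusion)) {
                    changed = true;
                }
            }
        }
        return facts;
    }

    public static void main(String[] args) {
        List<Rule> rules = List.of(
                new Rule(Set.of("sunny"), "need sunscreen"),
                new Rule(Set.of("need sunscreen", "no sunscreen at home"), "go shopping"));
        System.out.println(infer(Set.of("sunny", "no sunscreen at home"), rules));
    }
}
```

Note that the second rule only becomes eligible after the first one has added a fact, which is the chaining part. A real engine avoids re-testing every rule on every pass, which is where Rete comes in.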
The consequence of a rule should never include the execution of tasks outside of adding, changing or removing facts. Do not implement the things you want to do based on the facts in the consequences; that belongs in other components.
Since the Rete algorithm is the ‘secret sauce’ of many rule engines it is good to understand at a high level how it works. The rule engine is composed of the inference engine, the rule base, and working memory, which contains all the facts. The rule base is the full set of production rules, which is then read into the inference engine as a Rete network. A Rete network is a tree of nodes, where every node is the predicate of a rule and the consequence of the rule is added to the facts as the input of the next node. Inside the inference engine, rules are matched against facts in the working memory using the pattern matcher (Rete). The unordered list of ‘activations’, or matched rules, is ordered into the agenda, which is the conflict resolver. The first rule on the agenda is fired, which may trigger the process all over again. Drools in particular is a forward chaining inference engine based on the Rete algorithm.
You define your rules either in XML (blech) or the Drools Rule Language, which is parsed into an AST and then into a package. That package is applied to the working memory, which includes the truth maintenance system, the agenda with agenda event support, and working memory event support. Here is an overly simple example of the Drools Rule Language:
rule "MyRuleName"
    when
        predicate
    then
        consequence
end
In the predicate you match at the class level, so you have to be careful about including primitive wrapper types like Integer as facts: if you include two Integers, the rule will not know which one it is supposed to work with.
I decided to stick around for the Advanced Rules Programming with Drools presentation with Brian Sam-Bodden. To do a proper Advanced class he would need a couple of days, so he said we would probably be at about an intermediate level.
The Drools IDE for Eclipse 3.2 or higher is highly recommended. You will be able to create a new rule project and other project artifacts. As a side note he talked about a nice feature that allows you to use Excel spreadsheets as decision tables for rule generation. This is great if you are in an environment where business analysts are generating rules.
Domain Specific Languages in Drools allow you to extend the rule language to your problem domain, which lets programmers get away from business policy and focus on implementation. A DSL simplifies the rule base by removing domain specific duplication and has no performance impact at runtime. The DSL specification puts natural language on the left and the DRL syntax on the right.
[condition]If there is a Person with the name of "{name}"=Person(name=="{name}")
[consequence]Say {message}=System.out.println("{message}");
You can also add a duration attribute that delays the firing of a rule, once its conditions are true, until the given time (in milliseconds) elapses. He covered a lot of great examples of DSL rule implementation in Drools, but I am not even going to try to capture them here.
We then moved into grouping rules and controlling flow. After a brief review of salience, agenda groups and agenda filters he moved to his preferred method: rule flows. I did not entirely understand his description, but there is definitely a lot of power there. Drools is powerful by itself, but the addition of rule flows puts it over the top into serious mojo. This was a great session. I skipped the birds of a feather sessions so I could have a little Saturday left.
NFJS Spring 2009 – Day One
Posted by Jeffrey Hulten in Blog Posts on April 14th, 2009
NOTE: I am posting these entries a week and a half after the Spring 2009 event in Seattle due to time constraints in editing and such.
It’s my first time as an alumnus at a No Fluff Just Stuff conference. Attendance is smaller than the Fall ’08 show, but I understand that is pretty typical. I am glad to see that this conference is surviving in the lean times.
After the introduction I attended Neal Ford’s presentation on Emergent Design and Evolutionary Architecture. Neal’s presentation is based around a series of articles he is working on for the IBM developerWorks web site. One of the key points he made was that emergent design is really about identifying patterns (not to be confused with Patterns, such as those from the GoF book, Design Patterns: Elements of Reusable Object-Oriented Software).
“Software is more about communication than technology” — Neal Ford
Expressiveness matters! You will probably not see patterns and potential abstractions in Assembler code. The more expressive the language (Python, Ruby and Groovy being great examples) the more likely you are to see the places where you can refactor.
A word of warning! Do not try abstracting too early in the development process. Abstractions should come from working source code, not from your preconceived ideas of what abstractions are needed.
Neal then covered one of my favorite topics, technical debt. Every project has external forces that cause compromise. The key is to convince someone in power (management) that technical debt exists and then start a conversation about repayment.
Demonstration trumps discussion in this case. Use metrics from your project like cyclomatic complexity per line of code. As a process and operations guy I am a huge fan of proper metrics. The more technical debt you have the harder it is to see the emergent design of the project.
In the end, software and application design is about code; all other artifacts are transient and should be tossed when they diverge from the code. Out-of-date documentation is worse than none because it is actively misleading. Depending on the project you may find that keeping your documentation up to date is important, but you need to take the cost into account.
Some references include Neal’s The Productive Programmer.
After a short break we reassembled for another Neal Ford presentation, ‘Real World Refactoring’. Since all the major tools provide refactoring facilities, Neal focused on when and why to refactor.
First he talked about the building blocks of refactoring. ‘Composed method’ was the first, specifically divide your code into many small methods that each complete one and only one task. If you have a method longer than ten lines you have a candidate for method composition.
Once you have broken your methods up you can see if there are common pieces you can refactor up to a parent class. The key is that in the end a class can be only about one thing, shorter methods are easier to test, and you can find reusable assets where you didn’t know they were there.
The second building block was SLAP or the Single Layer of Abstraction Principle. The key of this idea is that crossing layers of abstraction, like from business logic to database logic, within one method is difficult to visualize and probably a good candidate for refactoring.
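Here is a minimal sketch of composed method and SLAP working together. The invoice domain, the class name, and the method names are all invented for illustration and did not come from Neal’s slides.

```java
import java.util.List;

final class InvoiceSketch {

    /** Reads as a list of steps, all at the same level of abstraction. */
    static long totalCents(List<Long> priceCents, int taxPercent) {
        long subtotal = subtotalCents(priceCents);
        long tax = taxCents(subtotal, taxPercent);
        return subtotal + tax;
    }

    /** One task only: add up the line items. */
    static long subtotalCents(List<Long> priceCents) {
        long sum = 0;
        for (long price : priceCents) {
            sum += price;
        }
        return sum;
    }

    /** One task only: compute tax on a subtotal (integer division truncates). */
    static long taxCents(long subtotalCents, int taxPercent) {
        return subtotalCents * taxPercent / 100;
    }

    public static void main(String[] args) {
        System.out.println(totalCents(List.of(1000L, 2000L), 10));
    }
}
```

Each helper is short enough to test on its own, and the top-level method never drops down into loop or arithmetic details.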
“Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior.” — Martin Fowler
Moving on he talked about decomposition, specifically taking large classes, breaking them down, and decoupling them from vendor libraries. It was a little complicated to try and explain, which probably means I still don’t understand it entirely. He did state that you should not decompose large things just because you can; they may not be evil. For example, the object representing your most important and strategic concept will probably be large, and if you split it up you may break the conversation with your business owners.
So you want to refactor. One of the questions people ask is whether you should branch when you refactor. The real problem is the merge hell you go through at the end. To mitigate that risk, time box your efforts and don’t be afraid to toss them. If you miscalculated the effort required, the breakage from weeks of merge hell and broken code is not worth it. Take a step back and rescope your efforts to something smaller.
One of the important refactoring issues for me is refactoring the database schema. For this there is a project called dbDeploy to help you manage the changes to your schema. One of the keys for refactoring is using triggers to synchronize data (such as in the case of moving a column) for the duration of the change window.

Pragmatic Programmers came out with a book called ‘The ThoughtWorks Anthology’ which has an essay called ‘Refactoring Ant Build Files’. It shows how you can apply common refactoring principles to Ant build files to remove the cruft that builds up over time.
After another break I moved over to another room to attend Scott Davis’ presentation on DSLs (Domain Specific Languages) in Groovy. An example of a domain is Facebook, which we would consider a part of the social networking domain. Amazon is part of the Internet sales and cloud computing domains.
One of the dangers is attempting to cover too much of the domain and losing the specific nature of the DSL. The joy of DSLs is the ability to build language tools specific to the job. SQL is an example of a domain specific language, as is Ant. SQL provides a strong language for retrieving data. Ant is intended just for building Java code.
Weekly Routine
Posted by Jeffrey Hulten in Blog Posts on March 24th, 2009
You may have seen my daily routine a while back. On top of having a daily routine, I find it’s important to have a weekly routine as well. My weekly routine is for things that I don’t want to tackle every day (I am tired when I get home from work and don’t want to spend my time paying bills) but need to get done regularly.
Therefore I have been building a weekly routine. The problem with a weekly routine is that it is harder to internalize than a daily routine.
So far I have managed a little bit of weekly routine at work on Mondays:
- review open tickets and determine priority for tasks this week
- review open tasks for items that haven’t been touched in a couple of weeks
  - consider removing those items entirely or moving to a someday file
  - if they need to stay, consider their priority or if they can be delegated
- review schedule and meetings this week
  - block out large chunks of time for tasks
  - leave the small gaps (for me, one hour or less) for other people to schedule meetings in
Here are the things I am trying to make a weekly routine:
- process postal mail (bills, etc.)
- do financial stuff (check accounts, verify payments, etc.)
Stupid BASH tricks…
Posted by Jeffrey Hulten in Blog Posts on March 22nd, 2009
From my coworker, Hiram…
#
# search file1 for lines not present in file2
#
while IFS= read -r i ; do if ! grep -qxF -- "$i" file2 ; then echo "$i" ; fi ; done < file1
Got any BASH one liners to share?







