I’m really interested in privacy and in how my data are collected, used, stored, and just generally exist. While I think of my data as mine, the more I think about it, the hazier that sense of ownership gets. I think defining ownership more clearly will help me think through what’s possible, and then what’s appropriate and reasonable to do with data.
I use Goodreads a lot because I like to read books and want to find more books to read. This morning I was reading a Hacker News thread about Sarah Manavis’s New Statesman article, and it reminded me of how much I want, I yearn, for useful book recommendations. I think about this pretty much every time I open the Goodreads page or app and hope that maybe this is the day they finally fix recommendations, so I figured I’d capture some ideas I have for fixing their recommendations.
The August 4th explosions in Beirut were very jarring to me, both because of the loss of human life and because of the numerous, direct videos that looked like something out of my 80s childhood nightmares. Hearing that it was caused by 2,750 tonnes of ammonium nitrate, I started working on my phone to quantify my fear by figuring out how easy it would be to ship that much, and how much it would cost.
Lately I’ve been thinking about data maturity models within large organizations, and how to measure maturity and the ability to use data. Specifically, what it means to be data literate or data fluent or whatever buzzword is used to mean “hip and with it like the cool kids” when it comes to being able to collect, sort, order, and use data to the greatest extent possible. To help think this through, I’m putting down a few thoughts here. My goal is to figure out ways to personally use data better, and to help others do the same.
I like to track my Tableau visualization projects in git to protect against screwups and out of curiosity about what previous versions looked like. Theoretically it might also help with collaborators working on the same workbook, but I’ve never actually done that or wanted to. I use live connections for my data sets, so the .twb files don’t contain sensitive data, and the XML at least makes it possible to get a sense of what changed. But Tableau stores thumbnail images of worksheets and dashboards as base64-encoded graphics inline in the XML. This makes the files really big, and I got tired of manually removing them, so I came up with this script to clean out the unneeded elements.
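The gist of the script is just deleting the thumbnail elements and writing the file back. Here’s a minimal sketch in Java rather than my actual script; the <thumbnails> element name is an assumption about the .twb format that’s worth checking against your own workbooks:

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TwbCleaner {
    // Strip the inline base64 preview images; everything else is left alone.
    // Assumes previews live inside a <thumbnails>...</thumbnails> block.
    static String stripThumbnails(String xml) {
        return xml.replaceAll("(?s)<thumbnails>.*?</thumbnails>\\s*", "");
    }

    public static void main(String[] args) throws Exception {
        Path twb = Paths.get(args[0]);
        String xml = new String(Files.readAllBytes(twb), StandardCharsets.UTF_8);
        Files.write(twb, stripThumbnails(xml).getBytes(StandardCharsets.UTF_8));
    }
}
```

I run something like this over each .twb before committing; wiring it into a git pre-commit hook would work too.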
I try to read every book by Neal Stephenson because he writes characters that I wish I could be, or at least come closer to that than any other characters in fiction. I was excited when Fall; or, Dodge in Hell came out. So excited that I accidentally ordered two hard copies, received one as a gift, and bought the Audible version to listen to on my commute. The book is great in general, but specifically it introduces something called a “Personal Unseverable Registered Designator for Anonymous Holography” that all the characters call by its acronym, PURDAH. I’m curious how close this is to existing.
I’ve been noodling on a few ideas and didn’t put any into markdown, because manually running the jekyll build and pushing to my web server required me to be near the right computer and spend a few minutes on really boring stuff. I never got around to using GitHub Actions when it was in beta, but figured I’d give it a try so I’d never have to think about the build again. I think I have it set up so all I have to do is push a commit to GitHub, and a few seconds later, stuff zips over to my host.
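For the curious, the workflow ends up being only a few lines. This is a rough sketch rather than my exact file: the action version, branch name, and rsync destination are placeholders, and the SSH key for the deploy step would come from a repository secret.

```yaml
name: build-and-deploy
on:
  push:
    branches: [master]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # Build the static site the same way I would locally
      - name: Build with Jekyll
        run: |
          gem install bundler jekyll
          jekyll build
      # Ship the generated _site directory to the web host
      - name: Deploy
        run: rsync -az _site/ deploy@example.com:/var/www/html/
```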
It’s been a long time since Google changed Blogger to no longer output static sites. I’m at a week-long conference putting together some ideas on how to train and learn about technology concepts, and figured that if I move my site off of Blogger, I’ll be more likely to write up something new. So I finally got around to updating this blog to use Jekyll instead of Blogger. My thinking is that I want the dead simplest site config and setup, one I won’t be tempted to tinker with and adjust various layout and style stuff. I also want to be able to switch between themes, as I will inevitably find a theme I like better. I’m writing up what I used and the steps I took so I remember how to update this site in the future.
I've been developing using Java on the Mac for a few years now. Historically (pre-2006, as best I remember) Apple was pretty slow at getting JDKs working (remember the PPC Blackdown project), but as long as I've had my MacBook Pro (Tiger and later), developing has been pretty straightforward. There are some quirks about running various JDKs, but nothing too frustrating.
This changed with my Snow Leopard install. I upgraded to Snow Leopard a few weeks ago and didn't notice any problems until Friday when I tried doing a build to a Java 1.5 target.
My $JAVA_HOME = /System/Library/Frameworks/JavaVM.framework/Versions/1.5.0/Home
(which has worked fine for a few years)
But when I ran an ant task with build.compiler=javac1.5 (or target=1.5 in the <javac> task), I got classes compiled with java 1.6 (I could tell because when I ran "javap -verbose classname" I got major version=50).
After scratching my head for a bit, I saw this message from ant: "[javac] This version of java does not support the classic compiler; upgrading to modern", which made me think something was wrong with my JDK.
When I ran "java -version" I got this output:
java version "1.6.0_15"
Java(TM) SE Runtime Environment (build 1.6.0_15-b03-219)
Java HotSpot(TM) 64-Bit Server VM (build 14.1-b02-90, mixed mode)
This kind of made sense because the java in my path was "/usr/bin/java" which symlinked to "/System/Library/Frameworks/JavaVM.framework/Versions/Current/Commands/java".
So I added $JAVA_HOME/bin to my path, like so: "export PATH=$JAVA_HOME/bin:$PATH". Now when I run "which java" I get "/System/Library/Frameworks/JavaVM.framework/Versions/1.5.0/Home/bin/java". But still when I run "java -version" I get the 1.6 output.
So after some googling, I finally came across this Google Groups article describing how Snow Leopard only includes JDK 1.6.
This was confusing because if I look in /System/Library/Frameworks/JavaVM.framework/Versions, I see 1.5 and 1.5.0 listed, but when I look more closely, both symlink to CurrentJDK. Now, there is a download that lets you have JDK 1.5, but I found it very counter-intuitive that Apple would do this.
So I thought I would post an article with all the things I googled for and didn't see any matches on, to perhaps shave some minutes off the search for the next Java developer who hits odd behavior with Snow Leopard. I sure am glad I got this upgrade for free.
Update [2009.09.28 2251EDT]: I discovered that if I tried to use a version name other than 1.5.0, I started getting this error:
"Shared archive: uninstalled generation"
This happened because even though I copied a 1.5 JDK to /System/Library/Frameworks/JavaVM.framework/Versions/1.5.0-leopard, when I tried to run java it was loading classes from /System/Library/Frameworks/JavaVM.framework/Versions/1.5.0/Classes/classes.jar (which pointed to the CurrentJDK). This caused the error above.
I fixed this by symlinking 1.5.0 -> 1.5.0-leopard.
I've noticed a growing trend with government projects claiming to be open source but then restricting access to source code and binaries. The US government is in an interesting space because technically all of the source code it produces is in the public domain. Of course, being FOIAable and actually running software in a transparent, open, collaborative manner are two different things.
The fact that the US government is moving toward open source is a good thing, but a few sites are troubling me. For example, there's Forge.mil, the DoD installation of SourceForge, where access "requires a valid DoD Common Access Card (CAC) or a PKI certificate issued by a DoD approved External Certificate Authority (ECA)." This doesn't sound very open to me. Why place these restrictions on viewing "open source" source?
Also, there's CONNECT, the "open source software gateway that connects an organization's Health IT systems to the Nationwide Health Information Network." It is excellent that the new administration (and the previous administration) are developing open standards for Health IT, but why should CONNECT force you to register and be approved before you can view their source code?
I think projects really need to review the Open Source Definition before they jump on the open source bandwagon. There are many shining examples of open source projects within the US government (caBIG, Epi-Info, SELinux).
Open source in government projects allows collaboration across federal, state and local levels, and also allows for immediate use by third-world nations. But if it is open source in name only, not in practice, that removes a lot of the value that transparency provides.
Update 2009.7.14: As of NHIN CONNECT's 2.1 release on July 7th, you can now download the source code as a giant zip without registering. This is an improvement, but I still can't view the code repository without registering.
Lately I've been spending a bit of time trying to find and hire programmers (due to having to spend time removing programmers). Hiring programmers is never an easy task as there are really good programmers out there and prying them away from their existing jobs is hard work. How to hire has been written on extensively.
But I've been thinking about personality types lately, and which ones fit into projects and make good developers. Many years ago my brother, who is a WWII history guy, told me about Kurt von Hammerstein-Equord's method for classifying his officers. I don't usually take the advice of Nazi-era generals, but this method has resonated with me, and every time I tell someone about it they seem to smile and draw some value from it.
Basically he had four defining characteristics: smart, stupid, lazy, industrious: "I divide my officers into four classes; the clever, the lazy, the industrious, and the stupid. Most often two of these qualities come together. The officers who are clever and industrious are fitted for the highest staff appointments. Those who are stupid and lazy make up around 90% of every army in the world, and they can be used for routine work. The man who is clever and lazy however is for the very highest command; he has the temperament and nerves to deal with all situations. But whoever is stupid and industrious is a menace and must be removed immediately!"
I think this applies to programmers pretty closely. Regard the quadrant below:
Programmers who are stupid and lazy are all over the place. In large projects there is room for these types, as there are always jobs for them to do. More importantly, they aren't dangerous: they'll be wasting time reading 4chan instead of injecting bugs. These are the types that are the reason project managers micromanage: "What do you mean you spent 8 hours editing a label on the welcome screen?"
Next are the smart and industrious programmers. I thought these were the prized employees, the ones you really want. Smart and hard-working is a good thing, right? Yes, that's right: these are the guys and ladies who will identify the problem and work hard until it's complete. The Boy Scouts of the programming world.
Closely related are the smart and lazy programmers. These are the guys who seem to waste time and mess around, but are able to re-use someone else's API to do the work better than spending 20 hours writing it from scratch. The trick here, though, is to make sure they aren't too lazy. I mean, you want them coming into work and all.
Finally, you have the stupid and industrious programmers. These are dangerous. These will ruin your project and make you miss dates. "I spent all weekend re-writing the login module so it will only use digital certificates instead of userid and password." Or: "I wrote a wonderful configurator to set the properties in the project. But the configurator only works on a JVM we don't use, and there's no other way to configure the project." Stuff like this will make the smart/lazies, stupid/lazies and smart/industrious work overtime just to get back to zero.
Update 2009.06.04 2154EDT: As pointed out below by thecodist, Hammerstein-Equord, although in the Nazi army, was actually against the Nazi party, hated Hitler and spent most of his later life trying to stop the Nazis.
I'm working on a project that uses the Globus Toolkit as a secure service container. Globus can be more effort than necessary to run services, but it provides a solid security stack that uses digital certs and mutual authentication through SSL for authentication and encryption. Two-way SSL is rather secure and usually doesn't raise too many eyebrows.
However, one of the security analysts insisted this was insecure, and that we needed an HTTP reverse proxy between the user and the service. His reasoning was that you can't put the service container in the DMZ, as it is too much of a security risk; instead you put a web server (IIS or Apache) in the DMZ and have it proxy all traffic directly to the service container.
This is frustrating for several reasons:
1) increased cost and complexity - I generally like KISS
2) Limited increase in security. A configuration I usually use is to have the app server in the DMZ and the database inside the local network. The external firewall limits all traffic to 443 (or 80 if you have plain HTTP), protects against DoS, etc. etc. The vulnerability is that someone could compromise the app server if there is some exploit that works on 443 or 80. But with a reverse proxy, everything is forwarded on to the app server, so you can still exploit any 443/80 vulnerabilities even if the app server is within the local network. In fact, this is a greater risk because now someone has compromised the app server inside the local network, rather than a server in the DMZ.
3) Having an HTTP server proxy SSL sessions means that there are now 2 SSL sessions. One between the user and the HTTP server and then one between the HTTP server and the service. This is not only a performance problem, but now you have to delegate the user credentials. This means that the service can't use stuff like mutual-authentication with users because the proxy server is a man in the middle (although one under benevolent control).
So, what I've been doing is asking everyone I know who designs web services if this isn't stupid. So far I've had 4 architects from 4 different companies say that this configuration is unnecessary and that they don't do it.
In the meantime, we're going to use web services without an HTTP proxy server. It seems NIST is on our side, so I'm hoping we'll be able to withstand the "We must triple encrypt stuff" crowd.
A client I work with mentioned that for high-security projects, developing them in an open source way will actually decrease the security provided by the project. The idea is that if anyone can see the architecture and code while it is being developed, they can prepare to compromise the security. This made sense at the time and I nodded, but after chewing on it for a few weeks, I think this is not the case at all.
There is certainly the argument that implementations are not open: of course no one will open up the server configs, passwords, private keys, etc. But the actual software used within an implementation gets more secure if developed as open source software.
So here's the short list off the top of my head of security related open source projects that are pretty widely used:
One of my clients came to me a few weeks ago with an interesting challenge: they want enterprise SOA, but have no money. This client is an extremely federated organization, with multiple IT groups all receiving their own funding.
There's definitely a need for enterprise SOA (governance, infrastructure, practices) but no authority to back anything official.
So in light of these restrictions, we're thinking of a two-pronged approach to assisting SOA efforts:
Chris Anderson published Free! Why $0.00 Is the Future of Business last month, and it got me thinking about some of the business and technical associations I belong to.
Over the years, I've been a member of a few groups: IEEE, IASA, AJUG, NYJSIG, etc etc. I'm also familiar with some of the major professional organizations: PMI, OMG, IETF, W3C, JCP. Some of these were free (IETF, AJUG, NYJSIG) while others required membership dues (IEEE, OMG, PMI). Some started out free and transitioned to membership dues (IASA).
In some of these organizations, it's very clear what the membership fees go toward. For example, in the IEEE, you get a magazine, group rates on health insurance, etc. In others (IASA), I'm not sure what the dues pay for.
Specifically, for architecture and programming groups, I think that the price for admission should be free and have tiers for additional membership if you want magazines, key chains, etc. This will serve to increase membership while keeping leadership in a purely voluntary capacity.
So this ends up being more closely aligned with open source "societies" like the Apache Foundation, where anyone can be part of the community, join listservs, register for conferences, etc.
I was cleaning out my briefcase and found some notes I had written down about what characteristics I need in an SOA registry (or repository, if you're one of those people).
I've seen almost all of these features spread out across a couple of different products, but I think eventually you will need these in order to have a successful SOA implementation. Each of these items deserves a whole post, so I'll keep it brief in this post.
A last note before I start enumerating: I think SOA registries are more useful at the private/internal/enterprise level than in the outdated public UDDI model of five years ago. Registries are needed to enable an enterprise's modular development, but aren't required for locating public services, as it's pretty much impossible to get all the different providers together in one registry.
One of my clients asked me a pretty common question this week:
IBM announced two weeks ago that they would add a free 12-week SOA mentor program to their IT certification program.
Just email firstname.lastname@example.org to register for the program. There's a real world meeting on Sep 12, but everything else is through email and webex.
You still need to pay for the exams, but free classes are cool.
I was recently at the PHIN 2007 conference facilitating a stakeholder group on collaboration and witnessed the following conversation with state and local public health partners:
Person A: We would benefit if we had a common strategy defined that we could follow.
Person B: Yes, we could define our processes so we could compare what we have in common and collaborate on systems.
A: Then we could create a plan for how to align our IT investment with business drivers (seriously, they said exactly this)
Person C: That's called enterprise architecture, you're talking about enterprise architecture.
B: No, don't say that word.
A: Yeah, what does that even mean? We looked up "enterprise" in the dictionary and it means a risk taking endeavor.
"The data repository contains the raw data received from partners. The data is then processed with business rules applied and loaded into the data warehouse. The data warehouse has analysis and structure for the aggregate data."
"Sorry, I don't know what you just said."
My google alert picked up Stefan Tilkov's post agreeing with Radovan Janecek's Anti-SOA post. So I will add in my own equally valid opinion to this chain.
To recap, Stefan and Radovan think that commonly accepted infrastructure pieces like ESBs and BPEL are actually detrimental to SOA.
While some of their points are accurate (encouraging P2P service communication rather than funnelling all SOA traffic through gateways or intermediaries), the term Anti-SOA is just provocative.
ESB/WSM/BPEL are all valid components of SOA infrastructure because they all serve valid purposes. Trying to create and maintain a single ESB or a single BPEL engine is fruitless, but so is trying to build a highly available, extreme-transaction environment without the orchestration, transformation, reliability, and security that these tools provide.
My experience is that if you try to deliver on a good SOA without providing the appropriate infrastructure and tools, then you will fail. If you try to create a solid governance policy without delivering the support that you find in ESBs and Registries then it will make your task that much more difficult.
My take on ESBs is that they are a good place to run services, and they provide a lot of needed infrastructure (security, transformation, reliability, persistence, orchestration, on and on) that you need to implement somehow. ESBs should not be treated as the end-all service environment for an organization; they should be designed to work with external and internal partners' services (especially since that line is getting blurred more and more), regardless of what kind of implementation provisions and executes those services.
At the end of the day, an organization needs the services provided by an ESB or a BPMS or a Registry/Repository or a Management System. But it is just one piece of the SOA implementation. If a vendor comes by and says ESB = SOA then question loudly and frequently.
My company recently sponsored me to become a certified enterprise architect through the Federal Enterprise Architecture
Certification Institute. It was a pretty interesting program and I thought I'd post a few thoughts here in case anyone else is considering plunking down
the money and time.
The program is a mixture of in class seminars, online coursework and exams, real life exams and oral presentations. The end goal is to educate and certify that you know Enterprise Architecture (particularly FEA/FEAF and/or DoDAF) inside and out.
We had two full weeks of classes at the Virginia Tech extension campus in Falls Church, Virginia followed by two days of exams and presentations. In between class sessions we had about 15 homework team assignments, each requiring 1-5 hours of work, so we were kept busy through the months of June, July and August.
The Department of Health and Human Services was sponsoring the training with about 20 employees with 8 contractors also attending and paying their own way. This meant that the traditional curriculum was modified a bit to also cover the HHS Enterprise Architecture Framework.
We ended up taking four courses:
So two weeks ago, I was attending the Gartner Architecture, Development and Integration Summit in Nashville. It was my first Gartner conference and had its ups and downs, but that is not why I'm writing.
Gartner has these gaps in the agenda where only sponsors present. The presentations are usually dry (although I saw some good ones, specifically BEA's AquaLogic session and the Infravio/WebMethods/Software AG sessions), and you have to either attend or wander around the Opryland hotel. Since this was the third day, I chose to sit in on Sun Microsystems' "Futureproof SOA" presentation, given by Ross Altman.
Ross is the CTO for Business Integration Platforms and seemed well up on the marketing speak. His presentation seemed like the least worst, so I sat down for an hour to listen.
The part that struck me was how Sun had some new web service and orchestration products that could create new applications with "Near Zero Code"™ (I guess Sun trademarked this so no one else would steal their excellent tag line). The speaker went on and on about all the wonderful new possibilities, and specifically this new feature where the "model is self documenting", etc. etc. It sounded like the good old days of someone pitching a 5GL.
So I'm wondering why it's so great not to write code when building applications. It's like some marketing wonks are sitting around talking about how code is so confusing, and if only they created some new XML-driven supertool, the analysts could write the apps directly. I've complained about this a bit here, so I shouldn't be too surprised to keep hearing it.
But the alternative to writing code is training your team to use some proprietary tool to crank out reams of nasty, nasty models or XML or whatever. The last time I coded, the latest IDEs did all the grunt work for you. Between IntelliJ, Eclipse and even NetBeans, programmers don't really spend a lot of time writing out rote code.
So it was especially comical that in a "Futureproof SOA" presentation, one of Sun's CTOs was advocating extreme model-driven development: using some tool with minimal exposure, learning new techniques, trying to get analysts thinking logically, and, most importantly, not using the only good thing Sun produces, Java.
In other news, I'm in DC this week working on my FEA certification that the FEAC Institute teaches. So far, I'm in the second week and they've had some really great speakers and teachers. I'll let you know how useful the classes and certification end up.
Lately a few people have asked me what Enterprise Architecture is and how they can learn more about it, because that's what they'd like to get into.
Now of course this is rather curious: who out there is doing so much PR that people who don't even know what EA is want to become EAs?
While we all ponder that, I have collected a few good starting places that I've found useful. I'm not really separating whether these are just for public service/federal EA or apply to the private sector too; there's enough overlap that I'm not making separate lists:
This morning I was downloading the latest episode of TWiT as part of iTunes syncing my podcasts, and I noticed it taking forever. The MP3 file is only 34 megs or so, but it was scheduled to take over 10 minutes. I'm downloading from my client site, and the connection is very quick over here.
As a test, I went to the twit site and downloaded the mp3. Only 50 seconds using Internet Explorer.
I went back and checked with iTunes and it had timed out and only grabbed 10 megs of the episode.
I thought maybe the podcast description points to a different location than the web site. Checking the podcast XML that shows up in the podcast description in iTunes, I see:
<title>TWiT 84: Hahn, I'm Home!</title>
I'm procrastinating from preparing a powerpoint deck to present some SOA principles to a group of consultants that my sub-contracting client employs. So I noticed that I haven't updated my blog in quite a while.
The real reason behind this is that I don't really interact with Java that much on a day-to-day basis. But I did recently work with a group that is trying to get JBoss approved at a US federal government agency. This is more difficult than it sounds, as it involves security audits and extra attention because of JBoss' open source status.
Since the EA group advises on new products accepted into the enterprise, and we think OSS is beneficial to the enterprise, we worked with the IT group to prepare JBoss for acceptance. This involved creating a baseline configuration document that could be used for the security evaluations. Interestingly, this isn't required for app servers from IBM and BEA, but that's life. JBoss went ahead and prepared the document, which we could reuse internally. Again interestingly, JBoss does not have such a document published on their web site. It would be very useful if they worked with a federal agency to certify JBoss, much like RedHat has done with its Linux distro.
After the baseline configuration came some practical testing: using encryption to prevent any cleartext userids and passwords. This required some code changes, as JBoss doesn't support this out of the box. But because JBoss is open source, the changes were pretty easy to make. This time I didn't make the changes myself; a pretty knowledgeable Java guy did all the heavy lifting. Each app server addresses encrypted passwords a little differently; I would like to see the JCP address this so a standard approach could be used to prevent the storage of cleartext. I've also seen this need in public corporations bound by Sarbanes-Oxley. Again, it is possible on each app server, but I'd like to be able to do something JCP-standard in the EAR I distribute.
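I can't share the actual JBoss changes, but the general shape of the approach is simple: store only an encrypted password in the config, and decrypt it in memory at startup. A minimal sketch of that idea (my own illustration, not JBoss's actual mechanism; the class name and key are made up, and a hardcoded key like this merely obfuscates rather than secures):

```java
import java.nio.charset.StandardCharsets;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

public class PasswordCodec {
    // Hypothetical key for illustration only; real setups protect the key
    // material separately from the config that holds the ciphertext.
    private static final SecretKeySpec KEY = new SecretKeySpec(
            "illustrationKey1".getBytes(StandardCharsets.UTF_8), "Blowfish");

    // Called once, offline, to produce the bytes stored in the config file
    static byte[] encrypt(String clear) throws Exception {
        Cipher c = Cipher.getInstance("Blowfish");
        c.init(Cipher.ENCRYPT_MODE, KEY);
        return c.doFinal(clear.getBytes(StandardCharsets.UTF_8));
    }

    // Called at startup to recover the datasource password in memory
    static String decrypt(byte[] enc) throws Exception {
        Cipher c = Cipher.getInstance("Blowfish");
        c.init(Cipher.DECRYPT_MODE, KEY);
        return new String(c.doFinal(enc), StandardCharsets.UTF_8);
    }
}
```

The point of a JCP standard would be exactly this: one agreed place to plug in the codec, instead of a slightly different login-module hack per app server.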
Other sideline news: I've started making lists with docs.google.com. It's not perfect yet, but useful for the basic functionality list making that I started out using MS Works/Excel with. I've added some lists to my site template: magazines subscribed to, future blog posts. Email comments if necessary.
More sideline news: iPhone looks cool, I may hold off buying a Blackjack. I'm searching out for useful handhelds as I think the blackberry is an aesthetic nightmare and BBB does not seem like the lifestyle for me.
I'm not the biggest sports fan. Neither is my friend, Carlo (name changed to protect the innocent). Because of this, we will frequently have this conversation when necessity dictates that we talk about sports.
Carlo: Did you see that game?
Me: Yes, that was a close one. What did you think of that critical action that made a major difference to the outcome?
C: That player is overrated. The decisions by the referees were horrendous.
M: Definitely the worst I've seen this season.
RandomArchitect1: What is your take on the WS-TX standard [or any random acronym]?
RandomArchitect2: Well, it's still pretty early to say. Isn't the new spec out soon?
RA1: It just came out two or three weeks ago, and the new version has some great new features.
RA2: Yeah, but how long will it take to pass our governance reviews?
So curiously enough, the architect titles came up again and again in my interviews. Each company I interviewed with asked how I defined "Architect" and the different stages, and each one defined their own hierarchy differently.
So after a month or so of interviewing, I accepted a position as an enterprise architect sub-contractor to a Big5 consulting company at a pretty interesting client here in Atlanta.
Again, curiously, after all the wondering about future career paths, I ended up going back to consulting, where skill directly corresponds to income. So far it is very challenging and interesting.
I'm working with Enterprise Architecture (see TOGAF, ZIFA and FEA) across very large (multi-billion dollar) IT groups. So I'll be adding some posts on what I think "Enterprise Architecture" consists of.
Since one of my main responsibilities is to keep up with goings on in the SOA/EAI/EA space, I try to read a ton of blogs and web sites.
eBizQ has been pretty handy with their free webinars. They are usually dull (as their archive shows such gems as "Measuring the Value of BPM", "The ROI of SOA" and "Where Data Meets SOA: Data Services"), but are actually a pretty decent source of ammo for when the suits start asking the tech group about why we should spend money on SOA initiatives.
This isn't an astroturf post, just a useful site that I wanted to share with the SOA/EA crowd. Add it to your filter along with the other free IT magazines and sites (Infoworld, Baseline, CIOInsight, etc etc).
I'll eventually gather all these web sites, magazines and blogs into a google spreadsheet or my del.icio.us.
A few weeks ago, I read a TheServerSide post that referenced Marty Andrews' post about defining software architecture roles.
These posts happened just in time, as in the past two weeks I've been asked this question over and over.
My employer has been trying to define what exactly an architect is so they can create a career path for other engineers and architects. Currently we have:
I started to reply to ginni's comment on my last post but ran out of room, so I will expand in a separate post.
Since specifying the provider wasn't working properly, I had to skip the JCE API and call out to BouncyCastle directly. Following the 1.34 javadoc, I wrote some code like this:
byte[] clearBytes = myString.getBytes("UTF8");
org.bouncycastle.crypto.digests.MD5Digest md5Digest = new MD5Digest();
md5Digest.update(clearBytes, 0, clearBytes.length);
byte[] hashedBytes = new byte[md5Digest.getDigestSize()];
md5Digest.doFinal(hashedBytes, 0);
byte[] base64Bytes = org.apache.commons.codec.binary.Base64.encodeBase64(hashedBytes);
String displayableHashValue = new String(base64Bytes, "UTF8");
I recently worked around a curious multithreading bug on IBM's AIX JRE. It was one of those painful, but interesting bugs that I thought I should share.
One of the developers I work with reported an issue with a piece of code that generates GUIDs. The error only manifested under heavy loads running in OAS and only on IBM's AIX JRE (build 1.4.1, J2RE 1.4.1 IBM AIX build ca1411-20030930). Everything ran fine on Sun Windows/Solaris/HPUX and IBM z/OS & Windows, only AIX's JRE had a problem.
The error was a pretty basic "String index out of range: 12" that occurs when you try to substring a string without enough characters. But this wasn't the cause. The developer who wrote the code was using the MD5 algorithm from the JCE to hash up some random data, and he was eating the real exception (of course really, really bad, but that's another story).
Here's the defect:
java.security.NoSuchAlgorithmException: class configured for MessageDigest(provider: BootstrapProvider version 1.1)cannot be found.
// default provider lookup:
MessageDigest md5Digest = MessageDigest.getInstance("MD5");
// naming the BouncyCastle provider explicitly:
MessageDigest md5Digest = MessageDigest.getInstance("MD5", BouncyCastleProvider.PROVIDER_NAME);
// passing a provider instance directly:
MessageDigest md5Digest = MessageDigest.getInstance("MD5", new BouncyCastleProvider());
Today I was having a problem with my servlet filters on WebSphere (WAS). The Servlet specification's web.xml DTD provides the filter-mapping element to apply a filter to requests matching a url-pattern. But my javax.servlet.Filter was not processing for my Struts actions. This was strange, as everything was working fine on developers' local workstations and in our test environments. Of course, the devs use OC4J and the test environments run OAS, so there were some differences that could lead to this problem.
The pattern a developer is trying to use is "/path/*.do". You would think the app server would apply the mapped filter to any HTTP request ending in .do under /path/. OAS, Tomcat and OC4J interpret it that way. WAS, however, decides this doesn't match the servlet spec and only matches the literal request path "/path/*.do".
It looks like section 11.2 of the Servlet 2.3 spec has the following to say about defining mappings: "A string beginning with a '/' character and ending with a '/*' suffix is used for path mapping." So a mixed pattern like "/path/*.do" matches none of the defined forms.
As it is currently written, an architect has to plan the structure of his application depending on what servlet and request filters he expects to use. This is an unpleasant limitation as the implementation affects the structure and is hard to change several years into a project.
I'm not sure if OAS and Tomcat specifically extended support of the spec, or they were just too lazy to use * for anything other than a string wild card match (a good thing).
So WebSphere is adhering to the letter of the law, but not the spirit. I hope that WAS6 or WAS7 make this change.
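Until then, one spec-compliant workaround is to map the filter with a plain path prefix and check the extension inside the filter itself. A sketch, with a hypothetical filter name:

```xml
<!-- Matches the spec's path-mapping form; the filter decides about .do -->
<filter-mapping>
  <filter-name>myStrutsFilter</filter-name> <!-- hypothetical name -->
  <url-pattern>/path/*</url-pattern>
</filter-mapping>
```

Inside doFilter, skip any request whose URI doesn't end in .do. It's an extra check on every request under /path/, but it behaves the same on OAS, Tomcat, OC4J and WAS.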
I will add a comment to the new JSR for Servlet 2.6 when it is created (jsr154 for 2.4/2.5 just wrapped up in May), as I think this is a bit of useful logic that should be in every servlet container.
Since my company is rolling out all our web services across our SOA stack, I wanted an easy way to test web services using SOAP over HTTP. More specifically, I wanted a tool easy enough to show biz/analyst people how to execute web services.
My requirements were pretty simple, given a WSDL file, present me with a UI to enter all the fields, then submit the request and show me the response in a relatively pretty view.
I guessed this would be really simple to find, as executing web services is a pretty common need. I googled around for a suitable client program and didn't find much. First I found .Net WebServiceStudio. It presents a basic UI, but given a WSDL url or file it will parse it and give you an input pane for all the request doc fields. It even does basic validation and will give you drop-downs if your WSDL schema defines enumerations. This seemed cool, but because .Net sends a Byte Order Mark (BOM) on its UTF-8 request docs, Java blows up. Our Service Runtime Engine is implemented in Java, so it uses the Java XML libs. This is one of the instances where Microsoft is right and Sun is wrong: Sun decided not to fix this defect in the JRE because it might mess up backwards compatibility. Hooray Sun! As soon as Java is open source this will get fixed pretty quickly.
Since all the MS consoles were ruled out, I kept googling. Eclipse has some good plug-ins, like the Web Services Console Plugin and a ton of others, that give this functionality. But when I tried to show our analysts how to set up Eclipse and the plugins, it didn't work out so well.
There's also a decent plugin for my preferred Java IDE, IntelliJ, but I can't hand that over to analysts because they don't have IntelliJ licenses.
If you are rich and/or are employed by a rich company you can use Progress Software's Stylus Studio ($150-800 per license) or Altova's XmlSpy ($1200-2200).
In the end I found Integration Central's Quasar the best fit, mainly because it's cheap (only $90) and simple. The UI is clunky and leaves much to be desired, but since the only feature I want is to submit SOAP document-style requests over HTTP, it is good enough for now. Our vendor has committed to adding a workaround so MS-based WS clients can invoke our services, and when that happens we will switch over to Microsoft's tool kits.
But I really expect a decent, Open Source web service client tool to come out in Java or Python or something any day now. It probably doesn't exist now because nerdy OS guys are probably content to use Eclipse/IntelliJ plugins. One more item I've added to my handy dandy google spreadsheet of software projects to write.
PS- I find it odd that google spreadsheets doesn't have a public URL to view as HTML or something. This would be pretty useful.
A colleague of mine pointed out this Financial Times India article that is pretty interesting.
I'm a firm proponent of off-shoring (not necessarily outsourcing) and developing software abroad. In fact, I think that if you aren't off-shoring properly now, you won't be in business 5 years from now.
This article talks about how salaries in India's IT sector are increasing dramatically for mid-level to senior-level programmers and other techies. Software developers with 5 years of experience are averaging $56k/year (Rs 25 lakh). Developers with 10-15 years are averaging $222k/year (Rs 1 crore).
This is amazing. Couple this with the purchasing power living in India and I'm ready to pack my bags and move over. My visa is good for 4 more years, I'd seriously like to pick up a job pulling in $200k. Not too shabby.
According to the article, it's actually cheaper to hire in the US than in India. And that's just from straight salary comparison, not including infrastructure, benefits, etc.
I think these salary trends show that the quality of software coming out of "new IT" countries is really improving.
Oh yeah, I stopped waiting for Apple to buy Tivo and bought a Microsoft Media Center PC. I feel dirty because it's an HP running Windows and I bought it from BestBuy. But, the interface is great and now I can watch my torrents on my TV and store recorded shows on a 1TB disk array. Let's see you store 2000 hours on your Tivo.
The tech economy has heated up and seems to be back to 2000 era levels. The unemployment rate for technology is down to 2.5% so I don't talk to my colleagues too much about changing careers.
But a few years ago, it was a different story. A lot of time was spent worrying about what to do when we all lost our jobs programming and analyzing and dba'ing. I knew some people who had sort of armageddon nightmares of starting a carpet cleaning business or even a QuickTrip (although this idea seems golden if you can find property). But most of the time we settled on two career paths: law school or an MBA.
The hardest part about leaving IT is leaving the relatively decent salary that goes along with programming, testing or admining. If you move to consulting or marketing or something like that you end up at the bottom of the ladder and take a big pay cut. Lawyers and business suits have a good potential to start off at or above the pay level of an IT position.
The idea is that you can get either of these degrees while keeping your day job and finally transition off to a good job once you're ready (or forced to). I live in Atlanta, so there are a lot of education choices available. For law school, there's Emory, Georgia State, University of Georgia and Mercer. For business school the list is similar but also includes Georgia Tech. Most major cities will have a similar selection. If you live near New York, Boston or San Francisco, you're extra lucky and have the best selections in the world.
If you end up with a law degree you can easily pick up IP/Trademark law, or use the high-stress experience and tech skills to do well in real estate, immigration or labor law.
If you end up with an MBA, you can go into accounting, management or the kind of consulting that gets you a signing bonus and 50 weeks of travel each year.
There must be other careers out there that pay well and utilize the same problem solving and ever changing skill sets that IT professionals use. I'm sure that as soon as the next downturn starts I'll have time to discuss this with my fellow co-workers as we wait for the next round of layoffs.
My company is kind enough to shell out and send me to JavaOne this year. One of the many ways they are pretty decent. Since I'm here, I figure I'll add to the infinite supply of JavaOne blogs and add my first impressions.
So, my project does a lot of in-memory xml manipulations. This means we add, remove and even move xml nodes within a document. We chose dom4j because it has the fastest writes, edits and xpath (using jaxen) among dom4j, jdom, xerces+xalan and xerces+jaxen.
Anyway, I started running into some strange errors when I tried to move an element within a document. A move consists of detaching an element from its parent and assigning it to a new parent, either after or before a particular node.
Dom4j lets you position new nodes by providing a List interface to the underlying structure. So you get a particular parent element and then get the list using either Branch.content() or Element.elements(); content() gives you all of the child nodes, elements() gives you only the child nodes that are elements. Now that you have a list, you can add an element using the standard List methods add(Object) and add(int, Object).
The problem arose on one line of code, which threw:
java.lang.IndexOutOfBoundsException: Index: -1, Size: 12
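The mechanics of that error can be sketched with a plain java.util.List, since dom4j's content() view behaves like one: detach() removes the node from the old parent's list, so any index you compute afterwards can be stale.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaleIndexDemo {
    public static void main(String[] args) {
        List<String> content = new ArrayList<>(Arrays.asList("a", "b", "c"));

        // Simulate detach(): the element leaves the parent's content list
        content.remove("b");

        // Bug: computing the target index after the detach yields -1
        int staleIndex = content.indexOf("b");
        try {
            content.add(staleIndex, "b");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getMessage()); // Index: -1 ...
        }

        // Fix: decide the insertion index before (or independently of) the detach
        content.add(0, "b");
        System.out.println(content); // [b, a, c]
    }
}
```

The same rule applies to dom4j's real content list: compute where the node should go against the new parent, not from indices captured before the detach.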
The projects I design and work on are maturing (at the ripe old age of two years). And so they require some maintenance.
There seems to be two approaches to fixing bugs: random assignment and area ownership. I'm pretty biased toward ownership but I'll try to stay neutral.
Random assignment means that whenever you find a bug in the system it gets assigned to whomever is free. This is good because you don't have a lot of down time and the bug gets assigned quickly and lets the original developer work on whatever they are currently working on. This is bad because frequently the person assigned has no idea how the flux capacitor works and ends up asking the original developer anyway. This seems favored by lots of project managers. Not sure why, I guess it's the idea that all resources are interchangeable or they don't want to have developers taken off new development to do maintenance. I haven't figured this out yet, but please feel free to conject.
I favor the ownership approach where programmer A writes some code and then as long as anything breaks, programmer A fixes it. This is good because programmer A knows the code and can fix it the quickest, but is bad because programmer A is probably busy on something else. This is also bad because programmers don't like to be digging up code they wrote 5 years ago and fixing it. A similar approach is that a team of programmers work on an area and bugs get farmed out to the entire team.
I think the best way to knock out software bugs is a hybrid approach where teams own code, figure out who knows the affected area best, and move that person onto the bug temporarily until it is fixed. This means the project plan has to set aside a certain amount of time for maintenance even while developers are working on new code. This should fix the problem of programmers spending 8 hours fixing the legendary "5 minute bug" (I'll post about this some other time; I had a CEO who constantly had these "5 minute fixes" that he wanted).
So lately I have been thinking about true service-oriented development/programming (SODA/SOP). Of course, SOA and web services are excellent; they allow easy reuse of code and true distributed applications. But with SODA, you develop solely using services. SODA/SOP ends up very useful when you are aggregating services together and transforming. What ends up being cumbersome is when you use a tool like WebLogic's WLI or webMethods or similar SODA workbenches to assemble applications and code business and presentation logic in these design tools.
So you end up with some sort of proprietary store of business/presentation logic that can be accessed using SOAP. In and of itself, it could be worse. But if you spend thousands of hours assembling an application in webMethods and 5 years from now want to use some other tool or runtime, good luck.
Remember back in the 90s when 5GLs were talking about how easy it is to build client server (and eventually web apps) using tools like Informix, Oracle Forms, PowerBuilder, Clarion, maybe even VB? Ever try to port or maintain one of those apps? Ever try to figure out why someone's Forms app is performing poorly because they WYSIWYG'd their way to a crappy app? It's unpleasant. Any time you program in some super high level language it's only good for the first 90% of the functionality. That last 10% takes a lot of time and custom code and this 90% differentiates you from your competitors.
There's another dark side. I recently learned that COBOL is not free. If you want to run COBOL programs you must pay Fujitsu or MicroFocus or IBM or somebody for your runtime licenses. Not just for IDEs, but just to run a language. As someone who grew up with C and Java, this was surprising to me. Even Microsoft doesn't charge for the VB/C# runtime.
So if your app is coded in some 5GL SODA tool, you must pay the vendor, forever. So as long as your app exists you will be paying someone just to run your newly constructed services. This is bad if you now want to sell your $1MM system to smaller companies as $50k packages as you're saddled with a big license fee.
What's the alternative? This is a bit tricky and I don't have the answer. But I think if you get your business logic coded in something like C or Java or Perl, you can run in many free runtimes and operating systems. There are several open source projects to expose code as web services. Axis is pretty mature and is great for exposing Java objects as web services.
Sure you lose your fancy design tool, but you gain the flexibility to embed your code anywhere and 10 years from now you can use the same logic units in whatever the new SOAP is in 2016.
So the SODA tools demo great. If you have a simple app that just needs to call a service and show some data, go to town. But if you have a complex application with hundreds of screens and thousands of services, think about the future and design with tools that are easy to code in, maintain and debug.
I'm not a Mac person, but lately their hardware and prices have gotten appealing. I'm still scarred from my youth when I had a Mac+ and it wouldn't run all the games my friends could run.
The new Mac Minis came out Tuesday and they are a huge improvement over the previous versions. 1.5GHz dual-core for $600, it's a good price. The thing that I really want out of a Mac Mini, and that I expected to be released on Tuesday, is a Tivo/DVR/PVR like functionality. The new minis have FrontRow which is great for playing video, music and pictures.
But I need something that can pause live tv, rewind, record shows, etc. This software is fairly trivial and the mac mini is now strong enough to support live encoding/decoding of video streams. It even has that cool remote. But I think the reason they didn't include this software is because Apple will buy TiVo for its software and bundle it with the newer mac minis that come out later.
This would be a pretty good match as they would have an even better delivery platform for iTunes video.
As soon as the Mac Mini comes out with PVR or DVR tech, I'm buying one (with or without TiVo). There's got to be a huge market for $500ish boxen with PVR capability and no monthly fee. If Apple bought TiVo, they could clean up TiVo's software and the DRM nightmare that prevents you from doing anything other than watching TV on it.
A few years ago I joked with a friend that she should be in marketing because she had never read Lord of the Rings. I was joking as she's actually a pretty accomplished architect and developer, but it's been in the back of my mind for a while. When I was a youth, all the good programmers I knew were 2600-buying, RPG-playing (pen and paper, Nintendo, pc and mud), David Eddings-reading, Star Trek quoting nerds. These aren't your new fashionable nerds who go to the gym and know how to remove DRM from iTunes songs. These were real "afraid of girls" kind of nerds. The kind of guys who would come in over a week-end and re-write the entire app so it would be cooler (of course this is still the late 80s and early 90s so I'm not talking real programmers).
So as a youth I always equated technical prowess with nerdiness. Here's a very basic chart (showing elvish before the movies came out, so it was much, much worse). I got to grow up around pc shops when they were run by super nerds who built their own boxes out of computer shopper and let you borrow civ1.
But as I entered the work force, I started coming across programmers who were doing it for the money, not because it seemed so cool in an Arthur C. Clarke or William Gibson book. Now there are more competent programmers who wouldn't know Star Wars from Star Trek and don't remember Legos back when they were really cool and castle walls came in 50 pieces not one.
I guess this trend is a good thing. If you've ever tried to discuss design or functional testing with someone who hasn't bathed in 3 days and responds in Klingon you know how unpleasant it can be. I still think that someone who loves programming will outperform someone who just is in it for the money, but I guess you can now love programming and not be a nerd.
There's a lot of info every day and every week, and keeping up with it all is a challenge. For newsfeeds there are sites like theserverside.com, digg.com and even slashdot.org if you want to keep up with the latest big-name tech or Apple product. But it's harder and harder to find valuable content that helps with the tough tech decisions.
I tend to read a lot of magazines and journals as they are printed on paper and I can read them while I'm offline. The good thing about IT journals is that if you qualify, many are free. Here's a list of magazines that I read frequently:
OK, so imagine you're about to buy some new J2EE application. It could be really complex and run on a full J2EE server with EJBs and JMS and XA transactions, or it could be a simple servlet app that can run in Resin or something. But however it comes packaged, there comes a time when someone asks "How many users can it support per processor?" or "What is the average response time under load?"
The potential vendor can hand you a white paper like this one from SAP or this one from IBM. It will be great and list out what processors they use, amount of ram, how many users and the response time charted for various loads. If they do this then read it and size your hardware and software purchases appropriately for your new application.
However, if they, like many J2EE app vendors, say something like "Well, we have some internal numbers but nothing specific." or "We have some rough statistics, but our app is really, really fast." then run away and don't buy the product. These are code words for "We've never tested our app under load. It will probably crash under 5 users and cost you hundreds of thousands to debug and repair. Not to mention millions of huge, wasted 8 processor sun boxen."
Of course, I'm being a little facetious, but I've seen this many times. A few years ago I was looking at an app where they assured us that the app had been load tested and benchmarked. That the app ran clustered great and could support thousands of users on cheap hardware. This sounded promising so I asked to see the reports. They said they would email them right over. It's now two years later and I haven't seen any reports. Thankfully, we didn't buy the app.
Note- It looks like Oracle made a failed bid to buy MySql. Their justification was that at least when their database was getting eroded by open source, it would be their open source. This rationale also applied to the (still undenied) rumors of the JBoss buyout.
So the whole world is buzzing that Oracle could buy JBoss. JBoss is not commenting, and oracle is mum too. We won't know until Oracle coughs up the cash and Fleury becomes an Oracle VP.
What I'm interested in is what will happen to JBoss, and to Oracle Application Server (OAS). A couple of years ago, Oracle had a particularly horrific app server (also called OAS). They bought OrionServer's source, renamed it OC4J and released a much better Oracle internet Application Server (Oracle iAS) using OC4J as its J2EE core. With 10g they still use OC4J and have made tons of changes (and reverted to the "OAS" name). If you look at OC4J it's really nice and simple: pure Java, the install is an unzip, and it runs everywhere you have a JRE. OAS, however, is sort of a big, bulky app server. It has lots of features and is much more complicated than plain OC4J.
Currently JBoss is pretty cool and easy to use. My fear is that Oracle will replace OC4J in their app server with JBoss guts. So basically Oracle would sort of complicate up JBoss and make it harder to use.
Not to mention my fear that Oracle would cease development of the open source JBoss and only develop for their required support versions. But I think this would kill their reputation. And that would hurt in their push to compete with IBM and Sun as "open source" friendly companies.
But then acquiring JBoss could just be an effort to get any market share for their app server, which is getting butchered by WebSphere, WebLogic and JBoss.
I'm surprised that Microsoft isn't trying to buy JBoss so they can directly compete with the app server market with an app server that runs J2EE and .Net.
Oh yeah, happy valentine's day all you nerds and nerd-wives. It's a scam, but of all the scams around, not the worst.
Back in my consulting days, I worked with a lot of guys who interviewed a lot. One day one of them was telling me how he aced four rounds of interviewing, only to finally lose out because he knew nothing about Java threads.
It's possible to have a lucrative career in Java without ever dealing with threads. As long as you know how to use static and instance variables, the Java and J2EE APIs are pretty safe. But when you do have threading problems, it's a skill that you'll need to learn.
One way I experienced this wasn't in writing multi-threaded apps, but in using third-party frameworks and toolkits in my J2EE apps. Since everything goes into the ear, I'm redeploying to make changes. When I redeploy, the app server stops the application and then restarts it. Some thread-spawning apps don't stop cleanly, and of course it's a pain to track down the offending threads. So my app servers wouldn't shut down properly because of dangling threads.
A co-worker of mine recommended the very blandly named "StackTrace" tool from "Troubleshooting Tools For Java" (no doubt named by the same team that came up with "Microsoft Office"). This tool is pretty useful for getting the thread dump of existing Java processes. It's not open source, but you can get a free eval and can run it directly from their web site using their Web Start link.
After firing up this tool, it became pretty clear which threads were the offenders, and it was easy to patch them up. My particular problem was caused because the application was launching its own threads and setting them as non-daemon. This means the JVM would wait for them to finish before exiting. Usually, in a J2EE app, all launched threads (if you really must launch threads) should be set as daemon, so if an admin shuts down the app server process, everything closes down cleanly.
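The fix is one line, but it has to happen before the thread starts. A minimal sketch:

```java
public class DaemonDemo {
    public static void main(String[] args) {
        // A worker loop that would normally keep the JVM alive forever
        Thread worker = new Thread(() -> {
            while (true) {
                try {
                    Thread.sleep(1000); // pretend to poll or watch something
                } catch (InterruptedException e) {
                    return;
                }
            }
        });

        // Daemon threads don't block JVM (or app server) shutdown.
        // setDaemon must be called before start(), or it throws
        // IllegalThreadStateException.
        worker.setDaemon(true);
        worker.start();

        System.out.println("main exiting; JVM will not wait for the worker");
        // With setDaemon(false) (the default), this program would hang here.
    }
}
```

Because the worker is a daemon, the JVM exits as soon as main returns, which is exactly the behavior you want on an app server redeploy.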
The whole world jumped on SOA last year (or maybe it was 2004, AJAX was the ur-buzzword last year). But, it seems fairly evident that SOA is here to stay and is really useful.
I like a product by NextAxiom that lets you build and run web services very quickly.
But it's turning out that creating web services is harder than describing them to non-technical (or even technical) staff. Non-technicals hear "service" and kind of think of phone service or water service, or maybe an amorphous blob of code somewhere. Technicals hear "service" and think of OOP and components.
There's the common definition of a service as a "remotely invokable, discoverable function with validating inputs and outputs". Which is pretty vague. There are SOAP interfaces, but SOAP can come over HTTP, SMTP, JMS, anything really. But the benefit of services sort of sinks in the first time a service is reused from across the world with only a few minutes of looking it up and adding it to your flow/service.
At that point, businessies and techies spark up and want to use services.
That convoluted introduction brings me to my point: how can you categorize services? There is a pretty common term: "business service"- which is a coarse grained, well defined service that represents a business process. Business services aren't just CRUD, they represent the process from a business standpoint and the underlying technology is completely transparent.
But then what do you call those prevalent CRUD services that aren't pretty but get the job done? They are still valid, have a WSDL file out there somewhere and can be called over any protocols by any service consumers. But what do you call them? I'm at a loss because the distinction is important to service developers. Business services are perfect and can go into the UDDI (or whatever registry you favor), but these other ones are getting the job done but might have nasty interfaces that you don't really want your trading partners to know about.
So, any suggestions?
I've been working with Java for about 8 years or so and about once per year I come across someone who tries to write a custom ClassLoader. Here's my rule: If you aren't an app server, don't write a new ClassLoader implementation.
Normally the attempt is made for a very good reason: there's no "hot reloading" of class definitions in Java by default. It sounds really easy to add some checks to a ClassLoader to see if a .class file's timestamp has changed, or to add a flush system. I would love for this to exist; I would use it in a heartbeat. Jakarta, BEA, JBoss, IBM and Oracle should add this functionality. But there's a reason why there is no project in Jakarta Commons: ClassLoader hierarchies can be very complex.
Here's a recent scenario. I use OAS and WebSphere a lot to deploy and test my apps. I use JAAS for authentication. This normally works fine because OAS and WAS allow you to change the ClassLoader hierarchy to check the current loader before the parent. So when I use JAAS within my ear, the LoginModule classes can be stored in the ear, instead of at the app server or system ClassLoader level. This is different from the default behavior that Sun suggests, but is highly useful; without it, I couldn't use JAAS to authenticate (I can't put my classes in the app or system classpath for reasons I'll explain some other day).
I'm using an embedded service engine, which I won't name but is pretty cool, and it builds its own Threads and ClassLoaders. But it does it like Sun says you should, not how OAS and WAS do it. So this means that when I try to use classes that live in the app or system classpath (like OAS's and WAS's JAAS frameworks, implicitly called by LoginContext.login), I get a bunch of NoClassDefFoundErrors. And there's really no workaround other than to drop my entire class library in the app or system classpath, which I can't do. So this app that I have to embed is sucking a bit because it can't see classes stored in the very ear that is running it. The effort was good-hearted, but it caused me extra work to hack around it.
So my plea is this: Don't mess with ClassLoaders, you can't do it. If you are a genius and make the perfect ClassLoader, start a project in commons so the world can benefit and I don't have to muck around in near-genius code that doesn't quite make it.
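For reference, the parent-last (check-local-first) delegation that OAS and WAS offer looks roughly like this. A sketch of the idea, not either vendor's actual implementation:

```java
import java.net.URL;
import java.net.URLClassLoader;

// Tries its own URLs before delegating to the parent -- the reverse of
// the parent-first delegation that Sun's ClassLoader javadoc recommends.
// Real implementations must still let the parent define java.* classes;
// here findClass simply fails for those and we fall back.
public class ParentLastClassLoader extends URLClassLoader {

    public ParentLastClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        Class<?> c = findLoadedClass(name);
        if (c == null) {
            try {
                c = findClass(name); // local URLs first
            } catch (ClassNotFoundException e) {
                c = getParent().loadClass(name); // fall back to parent
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }

    public static void main(String[] args) throws Exception {
        ParentLastClassLoader cl = new ParentLastClassLoader(
                new URL[0], ParentLastClassLoader.class.getClassLoader());
        // Nothing on the local path, so this falls through to the parent
        System.out.println(cl.loadClass("java.util.ArrayList").getName());
    }
}
```

Even this toy version hints at the complexity: two loaders can now define the same class name, and instances from each are incompatible at runtime, which is exactly why ClassLoader hierarchies get hairy.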
I end up creating a bunch of J2EE apps that run on Oracle Application Server (OAS). When writing servlets/JSPs it is frequently useful to get the URL or server name where the app is running. According to the javadocs for HttpServletRequest, there are a couple of useful methods to get this info. I usually stick to getServerName for just the host name or getRequestURL for the full URL (getRequestURI will have this info in a GET request). Both of these methods get their info from the HTTP Host header.
The trickiness comes with Apache: no matter what the browser sends in the Host header, it replaces the host portion with the configured ServerName. So if the browser sends
Host: foo.bar.baz.com:8888
but Apache has ServerName foo in httpd.conf, Apache will rewrite the header so your servlet sees foo instead of what the browser sent.
Recently I was working with some web service products. We run the engine on WebSphere and OAS on Solaris, AIX and z/OS, so I was trying to get it embedded in each of our deployments. WebSphere on AIX was giving some curious responses to our SOAP requests: it was reporting that the SOAP request was empty. On the client, the request looked fine, but once it got to the SOAP servlet, the request was empty. On OAS, there were no errors and the same request came through properly.
I set up OC4J to debug (it's lightweight and easy to set up; I like it better than Tomcat for J2EE dev and debugging). It was throwing java.nio.charset.IllegalCharsetNameException, but the Content-Type header being sent was: Content-type: text/xml; charset="UTF-8"
Another architect found out what the problem was. It seems like the combination of WebSphere and IBM's JRE for AIX didn't like the quotes in the http header's charset. But the Sun JRE with WebSphere doesn't mind. Strange. So when we changed the header to: Content-type: text/xml; charset=UTF-8, the request came through properly.
According to RFC 2616, the IETF's spec for HTTP 1.1, there are no quotes around the charset attribute in its examples, even though quotes do appear in other examples out there (like Example 9 in the W3C SOAP Primer).
So the moral of the story is: when you send SOAP requests over HTTP, don't use quotes around the charset value of the Content-Type header. Some app servers and JREs are flexible, but then again, some aren't.
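Setting the header from a plain Java client looks like this, a sketch using HttpURLConnection and a hypothetical endpoint URL:

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class CharsetHeaderDemo {
    public static void main(String[] args) throws Exception {
        // example.com is a placeholder; openConnection() doesn't touch the network
        URL endpoint = new URL("http://example.com/soap/service");
        HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();

        // Unquoted charset: safe everywhere.
        // charset="UTF-8" (quoted) is what blew up on IBM's AIX JRE.
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");

        System.out.println(conn.getRequestProperty("Content-Type"));
    }
}
```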
OK, so I was recently reading about a lawsuit against Wikipedia. This is a bit troubling because if the plaintiffs win, Wikipedia could get shut down. Typically this is pretty easy to do: the plaintiff takes the order to the ISP and down goes Wikipedia. This would be a bad outcome.
So I started thinking about ways to get around this. If there was a decentralized type of web server that mirrored the data on client machines, it would be impossible to shut down.
The idea is that you have a tracker site(s) that store a copy of the site and info on the participating nodes. Each node dedicated a certain amount of disk space that is claimed by the client program (say 100MB). This bucket is used to store encrypted, signed copies of content fragments. The idea is that each node might have 10% of a piece of content, not the entire amount. And the content is encrypted and signed so the clients don't know what they have and can't change it.
When a client requests a page, they actually end up with 10 HTTP requests to assemble and display the web page/content.
One cool thing about this is that bandwidth would not be consumed by the origin server any more, so really popular sites wouldn't have to pay as much for bandwidth. I don't suffer from this problem, but maybe one day. If a site gets a traffic spike, that could be a couple thousand dollars worth of bandwidth. With P2PWeb, the server bandwidth stays low because the clients make requests from multiple participating nodes.
I'll need to write a couple pieces of software: a tracker server and a client node program.
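The client-side reassembly is the easy part; here's a sketch of it (fetching over HTTP, decrypting and verifying signatures are all hand-waved away, so fragments arrive here as plain byte arrays):

```java
import java.io.ByteArrayOutputStream;
import java.util.List;

public class FragmentAssembler {

    // Reassembles a piece of content from ordered fragments fetched from
    // participating nodes. In the real system each fragment would arrive
    // over HTTP and be decrypted and signature-checked first.
    public static byte[] assemble(List<byte[]> fragments) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] fragment : fragments) {
            out.write(fragment, 0, fragment.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] page = assemble(java.util.Arrays.asList(
                "<html><body>".getBytes(),
                "Hello from the swarm".getBytes(),
                "</body></html>".getBytes()));
        System.out.println(new String(page));
    }
}
```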
So I work in a company with a couple of other architects. And these architects grew up designing j2ee apps using ejbs, servlets, jms, all that good stuff that j2ee makes you know. My company has lots of developers who grew up writing c and cobol and all those things where you spent lots of time on the program, not the architecture.
So, naturally this leads to some confusion. For example, when it comes time to design a process to send binary data down to the browser to display (like a pdf or word doc or something), instead of designing something nice like a generic binary downloader and displayer, someone writes a program that hard-codes word documents down to the browser by writing the word doc to the server's file system and then streaming the opened file down to the browser. So it goes like this: 1) read data from db 2) write data into a word doc on the file system 3) open the word doc 4) stream it down to the browser. And by the time an architect notices this code and the possible negative side effects it may have, the code is already in production.
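For comparison, the direct-streaming design is tiny. A sketch of the copy loop (in a real servlet you'd hand it response.getOutputStream() after setting Content-Type and Content-Disposition; no temp files on the server's disk):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BinaryDownloader {

    // Streams binary content (e.g. a document blob read straight from
    // the db) to an output stream without touching the file system.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] doc = "pretend this is a Word doc".getBytes();
        ByteArrayOutputStream browser = new ByteArrayOutputStream();
        long sent = copy(new ByteArrayInputStream(doc), browser);
        System.out.println("streamed " + sent + " bytes");
    }
}
```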
So, the whole point is that architects need to be checked by developers to make sure our designs don't suck and developers need to be checked by architects to make sure their code won't cause needless rework.
Also, something funny. I saw a resume (or maybe it was a job posting) that used the word "Architected" as in "Francke architected an enterprise level branding system." Is "design" a bad word or something? Will people get mixed up and think you're a photoshop jockey if you say "Francke designed an enterprise level branding system."?
Software development needs a new word for the architect role. I'm tired of people asking if I like Frank Lloyd Wright.
Sorry, this isn't a technical post, but here [flickr] and here [shutterfly] are some pictures from my recent trip to Vienna.
Of course the cell phones and plans are better over there and Sprint sucks for not having any phones that work in Europe.
Also, MARTA really seems inferior compared to all the other subway systems of the world. Shame on you Georgia.
So, since my project develops a lot of J2EE apps for various servers, we do a lot of debugging and profiling of J2EE apps on various app servers.
Currently we use OptimizeIt from Borland for some stuff, but we're thinking of adding a new tool to get better profiling data. One of our senior architects narrowed it down to JProbe from Quest and Introscope from Wily Technologies.
We need to profile on AIX, Solaris, HPUX, Windows and zOS on OAS, OC4J, Tomcat, WAS and WLS. Neither of these products does everything but based on the documentation, it seems Wily comes pretty close (JProbe only works on RedHat's zLinux for zOS).
So the first step was to get price estimates and evals. This should have been pretty easy, but it proved to be a monumental pain in the rear end. I guessed this stuff would cost between $5-20k so I expected to get pricing info online. Guess again: neither site provided pricing info online.
So I got out my phone and started calling. Both had regional sales reps who were out of the office and no one else could give me pricing information. So I placed messages and waited. The JProbe guy was pretty responsive and I talked to him and got all the info I need by the next day. Wily proved a bit more annoying.
I've gotten a couple of emails from the sales rep and a promise or two for him to call me, but it's been a week now and no contact.
So my question is, why the heck do I have to talk to sales people to get pricing and licensing info? I'm not buying some massive enterprise system. I just want some java software. It should be painless.
So short story, my budget is up in the air and I'm really wishing there was a good open source profiling tool out there. I think OS is taking off just because there's none of these stupid pricing frustrations.
Today, I discovered that my project was using three different date formats for various services used throughout the enterprise. In the past, our Java SOA just used java.util.Date objects and serialized them through remote calls and everything was fine.
Now, with Web Services, you run into different services that use different date formats. So you're stuck with some String representation. I've seen milliseconds from epoch, dd-MMM-yyyy, yyyyMMdd, dd/MM/yyyy, etc etc. And it varies a little partner by partner.
The W3C actually has a date datatype defined in the XML Schema Recommendation from May 2001, which looks like yyyy-mm-dd. So my project will be standardizing on the W3C format, and I hope that all the other projects out there will standardize on it too.
Really, the date format can be specified in the service schema, but usually I just see it as a String and it's up to the developer to read through documentation to figure out the appropriate format.
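Since we're standardizing, a little utility at the service boundary keeps the format in exactly one place. A sketch (note it builds a fresh SimpleDateFormat per call because that class isn't thread-safe):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class W3CDate {

    // xsd:date lexical form: yyyy-MM-dd (time zone handling left out for
    // simplicity; everything is normalized to GMT here).
    private static SimpleDateFormat format() {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd");
        f.setTimeZone(TimeZone.getTimeZone("GMT"));
        f.setLenient(false);
        return f;
    }

    public static String toWire(Date date) {
        return format().format(date);
    }

    public static Date fromWire(String text) throws ParseException {
        return format().parse(text);
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(toWire(new Date(0L)));           // prints 1970-01-01
        System.out.println(toWire(fromWire("2005-08-04"))); // round-trips cleanly
    }
}
```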
So, through an altogether comical chain of events, I am my project's "z/OS WebSphere guy" sort of. So I frequently spend time working on our apps on z/OS and/or WebSphere (we've also got some instances on other OSes). The cool news is that I got to go to WebSphere training in Denver, the downside is that I have to work with WebSphere on z/OS.
One of the most interesting things that you come upon with WebSphere is that it runs Java in ASCII mode, reading and writing files in ASCII, while z/OS stores its files in EBCDIC. This basically means that if you are telnetting into USS, you can't really view or edit those files. If you try to view an ASCII file through USS using vi, you get garbage; the workaround is to convert it on the fly:
iconv -f ISO8859-1 -t IBM-1047 file.name | more
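On the Java side, the same lesson applies: always name the charset explicitly instead of trusting the platform default (which on z/OS is an EBCDIC codepage like IBM-1047). A minimal sketch, using ISO8859-1 for the ASCII side:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class CharsetSafeIO {

    // Writes text in an explicit charset instead of the platform default.
    // Relying on FileReader/FileWriter is exactly what bites you on z/OS.
    public static void write(File file, String text, String charset) throws IOException {
        Writer w = new OutputStreamWriter(new FileOutputStream(file), charset);
        w.write(text);
        w.close();
    }

    public static String read(File file, String charset) throws IOException {
        BufferedReader r = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charset));
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = r.read()) != -1) {
            sb.append((char) c);
        }
        r.close();
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("charset", ".txt");
        write(f, "hello from USS", "ISO8859-1");
        System.out.println(read(f, "ISO8859-1"));
        f.delete();
    }
}
```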
So I got asked by a biz dev guy to help out and give a presentation at my company's sesquiannual product trade show/ conference. I'm giving a presentation on performance tuning and monitoring of the architecture my company uses to build and run application stacks.
So, I'm your typical geek guy who is pretty bad at PowerPoint. I follow all the normal rules (you know: 3-4 points per slide, don't write sentences, don't read off the slide, etc.), but my ppts just aren't as flashy as some of the really cool ones I've seen. From a knowledge standpoint they're ok and I've got the whole public speaking thing down, but I hope the audience isn't expecting cool graphics and lots of clip art.
But this leads me to some really terrible presentations I've seen by brilliant developers and architects. Sort of the ones where you can tell how bad it is because even the people without blackberries start having thumb wars to pass the time. I'm hoping it won't be that bad.
Someone at my company recently resigned. He's a pretty funny guy who will be missed, but he was telling stories of previous programmers from times long past. One of those guys was pretty outspoken and once exclaimed to his manager "Who makes these decisions? Are you a manager or a damager?" (the manager had just told the Board of Directors that we were using an arbitrary technology). So I love the word "damager" and hope I never encounter one (so far I've been blessed with skillful and caring managers).
For all the crazy managers I've worked for many years ago, I'm at a bit of a loss for stories right now. All I can think of is the CEO I worked with who didn't want me to hire someone "Because he's from Ohio." and another guy because his palm had perspiration.
So of course everyone who writes Java is familiar with "write once, run anywhere," and this has really helped Java get in a lot of doors. I guess it might hold if you have some console app or something, but in the real world there's a lot of time spent making sure your Java (and especially J2EE) applications run on various OSes, JREs and application servers.
Today I was conducting a dismal interview where I asked about the candidate's experience with porting a J2EE application from WebLogic to WebSphere (which was listed on his resume). The candidate said that it was very easy: he just deployed his application with no problems and no changes necessary for it to run properly. This was one of many bad signs for the candidate.
I really hate WebSphere and I really hate zOS (MVS). Because of this, the fates conspire to constantly make me port applications to WebSphere (I just finished my third port of a J2EE app to WebSphere). Porting to WebSphere isn't the worst thing in the world. I just had to change how we read the JNDI bindings for our EJBs (which will be easier once I remove all the EJBs from the application), redo the data sources, and change the JAAS configuration files, and I was set and the app was up and running. The real challenge came with zOS and its EBCDIC charset.
zOS is bizarre: it costs tons of money, runs slowly, is difficult to administer and gets JVMs and app server releases after everybody else; but companies use it. IBM is trying to sell Linux on the mainframe, but I'm not sure why this is a good thing. Somehow it's better to spend $1MM on a huge mainframe running a bastardized form of Linux than on 500 Linux servers from Dell? I'd love to see the mangled TCO studies on that one.
Anyway, our customers use zOS and they buy my company's software, so we support zOS (and make our app run pretty well, actually). Adding character sets to all our file I/O is probably a good idea anyway, as it makes internationalization easier. This solved about 50% of our problems. We also had a third party app that specifically needed the Sun JCE for some encryption algorithms it uses. The IBM JRE on zOS (and AIX, actually) of course doesn't come with the Sun JCE; it comes with the IBM JCE, which has pretty much the same algorithms. Usually, I just package BouncyCastle's provider or IAIK's provider, but I guess you can also just reference the provider name in your code. At one point I favored modifying the security.properties in the JRE to change the provider order, but this caused problems because 1) it confused customers and 2) the provider the customer has doesn't always have the strong algorithms I need (i.e. some wimpy Sun JCEs only have 128-bit AES/Rijndael and stuff like that).
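If you're debugging this kind of thing, the first step is usually to see which providers the JRE actually ships. A quick sketch (the BouncyCastle registration in the comment is the third-party route and needs its jar on the classpath):

```java
import java.security.Provider;
import java.security.Security;

public class JceProviders {

    // Returns the names of the JCE/JCA providers installed in this JRE.
    // On a Sun JRE you'll see "SUN", "SunJCE", etc.; on the IBM JRE you
    // get "IBMJCE" and friends, which is why hard-coding a provider name
    // like Cipher.getInstance("AES", "SunJCE") breaks when you port.
    public static String[] providerNames() {
        Provider[] providers = Security.getProviders();
        String[] names = new String[providers.length];
        for (int i = 0; i < providers.length; i++) {
            names[i] = providers[i].getName();
        }
        return names;
    }

    public static void main(String[] args) {
        // To stay portable, either omit the provider name and let the JCA
        // pick, or register a provider you ship yourself, e.g. BouncyCastle:
        //   Security.addProvider(new org.bouncycastle.jce.provider.BouncyCastleProvider());
        String[] names = providerNames();
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i]);
        }
    }
}
```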
In a nutshell: when porting your app to zOS, be careful about character sets, and when porting to non-Sun JREs, be careful about JRE differences (like XML support and cryptography support).
I've also had to port to/from OAS/OC4J, Tomcat, JBoss and WebLogic but they usually just boil down to internal security, data sources and JNDI settings.
So lately I've been participating in a lot of interviews. My company is looking for skilled developers who know about continuous integration using Ant, extreme programming and lots of other good stuff. While I'm in Bangalore the idea is to go through lots of interviews so we can find that perfect candidate.
Joel Spolsky (of JoelOnSoftware.com fame) has a good article on how important excellent programmers are. It's a valid idea and I generally agree. I'd rather have 3 excellent programmers than 100 mediocre ones. However, I'm sort of stuck in a situation where we can't hire rocket scientist programmers. We need solid developers with knowledge of Java/J2EE and various open source APIs.
In the olden days (1996-2000), the typical questions of "What's the difference between an interface and an abstract class?", "Draw a UML class diagram for String.", "How would you design a connection pool?", "What is a race condition?", etc. don't seem to be cutting out the chaff like they used to. I need a new set of questions that can really narrow down whether developers know their stuff or are just really good at googling. (Don't get me wrong, googling is extremely important to a good developer.)
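For what it's worth, the connection pool question still has some teeth, because a correct answer forces you into wait/notify territory where the race conditions live. A toy version of what I look for (Object stands in for a real Connection):

```java
import java.util.LinkedList;

public class SimplePool {

    private final LinkedList<Object> idle = new LinkedList<Object>();
    private final int max;
    private int created = 0;

    public SimplePool(int max) {
        this.max = max;
    }

    // Blocks until an object is available; this synchronized wait/notify
    // dance is exactly where interviewees trip over race conditions.
    public synchronized Object acquire() throws InterruptedException {
        while (idle.isEmpty() && created >= max) {
            wait();
        }
        if (!idle.isEmpty()) {
            return idle.removeFirst();
        }
        created++;
        return new Object(); // stand-in for creating a real Connection
    }

    public synchronized void release(Object o) {
        idle.addLast(o);
        notifyAll();
    }

    public synchronized int size() {
        return created;
    }
}
```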
So I found my new favorite site, techinterview.org (by one of Spolsky's programmers). It's full of great, useful brain teaser questions that should occupy a good 10 minutes in an hour interview.
So JSR220 (EJB 3.0) is out for public review now. It has some neat new features in it to try to simplify EJB development.
I've been working with EJBs for a long time (back since 2000 or something). I've always discounted Entity EJBs for being terribly ineffective. From 2000-2003, a typical interview question was "When would you use Entity beans?", with the correct answer being "Never" and the incorrect answer being Sun's blargh about how great they are.
Stateful beans were also kind of a waste, I've only seen one instance where you would actually use a stateful bean instead of some other more lightweight cache (like HttpSession for instance).
That left Stateless EJBs which were handy for transaction control and distributed apps. EJB2.0 brought us MDBs which are pretty handy for async processes (way better than the old db or file pollers).
In my current architecture that my company is using, we implement SOA with EJB service implementations. We did this for 3 reasons: 1) we get remote access to services and can distribute services across physical servers 2) we get container managed transactions 3) our previous technical director told the board of directors we were using EJBs so we had to.
So we have a bunch of services with EJB implementations. I was thinking about why we need the overhead of EJBs. I mean we never have remote access to our apps. There's no way we'd ever run the war and EJBs in separate JVMs. And our web service framework will provide remote access when someone really needs it (SOAP is just as fast as RMI). That kills issue #1.
A lot of our access goes to legacy systems that don't support J2C, so the EJB can't really control transactions; we have to code it ourselves. That kills #2.
Finally, number 3 isn't really relevant since the technical director is gone and the board has changed a lot since his edict.
So I'm thinking of just scrapping our EJBs and making our web apps call out to standard, plain Java classes. The service impls won't be distributable (away from the war) and they won't do any transaction magic for us, but they will be quick and easy to deploy. No more JNDI bindings and clustering settings. No more bean pools and crap like that. And then we can run in Tomcat, which makes life easier (and cheaper).
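The before and after looks roughly like this; AccountService and its method are made up, just to show the shape of a plain-Java service impl versus the old JNDI lookup dance:

```java
public class AccountService {

    // A plain service implementation: no home/remote interfaces, no
    // deployment descriptor, no container. The web tier just news one
    // up (or gets it from a simple factory) and calls it.
    public double convert(double amount, double rate) {
        return amount * rate;
    }

    public static void main(String[] args) {
        // Before: ctx.lookup("java:comp/env/ejb/AccountService"), narrow, create...
        // After:
        AccountService service = new AccountService();
        System.out.println(service.convert(100.0, 1.25));
    }
}
```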
I wonder if the spread of Web Services is really going to kill EJB, as most of the EJB features overlap with features provided by WS engines. Let me know if you're using EJBs and if so, why.
Oh yeah, IntelliJ IDEA 5.0 just came out. It's the best IDE in the world (for any language). I know it's better than Eclipse 3.1, but I can't really remember why. I know it's more than Eclipse's stupid compiler.
This post is only technical in the sense that a lot of programmers and project managers travel, and so have a need for excellent, durable luggage that will survive the madness of flying around and dealing with complex projects, clients, and everything else.
I started travelling a couple weeks a year in 2003. Being a relative youngster, my luggage was crap and I needed something new. I wanted 1) something that fits as a carry-on 2) something that will keep a dress shirt and pants relatively wrinkle-free 3) lots of pockets and stuff 4) strong and durable. This ended up being a bit difficult, as I found the regular opinion sites (http://www.epinions.com) didn't help me out too much. I ended up getting suckered into an overpriced Tumi 22 inch "suiter" ballistic nylon packing case that will hold about 2 weeks worth of clothes if you roll. This has suited me well on 1-2 week trips to San Francisco, Florida, the Bahamas, France, Italy, England, Denver, New York, Raleigh and a couple other places. The thing is light, rolls well, and is sturdy. I really like it. The only downside was that it was $500. I guess it's not so bad to have that bag thing down. The salesperson assured me that it's the only bag I'll ever buy.
But now I'm going on an extended trip of a month (to India) and I need something bigger. I was just going to get the bigger, must-be-checked Tumi 28 inch packing case, but I ended up seeing a really cool case: the Zero Halliburton line (no, not the Cheney Halliburton). They make a 29 inch aluminum-cased roller bag that is really cool. I almost bought it, but it was a bit more ($1000) than the Tumi and it folded in half so you couldn't actually fit as much stuff into it. Oh well, I went for the big Tumi and I'm pretty happy. It has a good lost bag policy and a lifetime warranty. Plus it's sturdy enough to sit on in an airport and I think I can fit my daughter inside of it.
I had this dream of buying a huge steamer trunk like in Joe Versus the Volcano, but couldn't find any place in Atlanta that stocks them.
Oh well, I bought the giant Tumi for $900 and I'll be packing it full of books (software and fiction) and hauling it across the world. I'll be checking out the new features in Eclipse 3.1 and comparing it to my new install of IDEA IntelliJ 4.5 on the flight and trying to refactor all the EJBs out of my company's architecture. I'm really trying to love Eclipse because of the whole open source thing, but IntelliJ is so freaking awesome.
This post isn't about copyrights or patents or anything like this. It's about the bond between a programmer and the code they write. More specifically about how this bond is weakening, if not already gone.
When I started out writing code about 10 years ago or so, when I wrote something I supported it. So if another programmer or a user found a bug, they told me about it and I fixed it. I would feel bad if bugs were found in my program. I wanted to write the bug-free, excellent software. This was a pretty common characteristic of programmers way back in the 90s. Programmers who wouldn't test or fix their code were usually looked down on and fired. Plus they were usually really annoying.
Lately, I've noticed reams of software checked in by fellow programmers at my unnamed employer that is just terrible. Both by onshore and offshore programmers. The stuff either just doesn't run (throws exceptions on execution) or is terrible to the point that the features are just missing.
An example... We use Hibernate for our Data Access/ORM. Someone checked in Hibernate classes and mapping files and client code to execute it, but the query threw an exception. This means that they never even ran their query. This is pretty sucky in and of itself, but the worst part is that developers then claim they are too busy on something else and that the bug should be fixed by someone else who has more time.
So now the lesson is that you can write crappy code and then claim to be too busy. Then the project manager will assign the cleanup to another programmer. It's a terrible cycle that churns out terrible software late and buggy. It seems like a bad way to do business but I see a lot of it.
I'll be in India next month so my post will probably not exist or be posted at a strange hour.
So I submitted two bugs to Sun regarding the 1.4 JDK. One is more important than the other.
The first bug has to do with how the java.lang.reflect.Proxy class works. The Proxy class is really handy if you ever need to do multiple inheritance or other nifty Java tricks. If you want to know more you can google or look at the decent article over at devx.com.
The problem is that when you pass an implementation of Proxy from one JVM to another, you get ClassNotFoundExceptions because the contract interface is in the application classloader and not the system classloader. This is because there is a special classloader for Proxy classes that does not use the current classloader; I suspect it uses the system classloader.
This was causing some major problems in a project I was involved in two years ago because we used Proxy classes all over the place for our transfer objects. It was actually the brainchild of this developer I work with, Ali, who came up with a handy way of creating stateful transfer objects by just defining the interface. This worked great until we tried passing the Proxy transfer objects from JUnit to the web or ejb container. It started blowing up.
I ended up patching the custom ObjectInputStream that BEA used for WebLogic and we got it to work, but I couldn't patch the default JDK/JRE ObjectInputStream used to serialize/deserialize. I submitted a bug to Sun (208933) so they could correct it. It's been about two years and the new 1.5/5.0 JDK has come out and this bug still exists. It's pretty frustrating. Had this been apache or struts or jboss or something and a rather serious bug existed, at least I could fix it locally. I can't wait until the Apache OS JDK comes out so we don't have to deal with those slackers at Sun.
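If you haven't seen the trick, here's roughly what those map-backed transfer objects look like (the interface and property names are made up for illustration; the classloader passed to newProxyInstance is roughly where the cross-JVM deserialization falls over):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Define only the interface; a map-backed InvocationHandler fakes the
// getters and setters, so there is no implementation class to write.
interface CustomerTO {
    void setName(String name);
    String getName();
}

public class ProxyTransferObject implements InvocationHandler {

    private final Map<String, Object> state = new HashMap<String, Object>();

    // Routes setXxx into the map and getXxx out of it. A real version
    // would also special-case equals/hashCode/toString.
    public Object invoke(Object proxy, Method method, Object[] args) {
        String name = method.getName();
        if (name.startsWith("set")) {
            state.put(name.substring(3), args[0]);
            return null;
        }
        return state.get(name.substring(3));
    }

    public static CustomerTO newInstance() {
        return (CustomerTO) Proxy.newProxyInstance(
                CustomerTO.class.getClassLoader(),
                new Class[] { CustomerTO.class },
                new ProxyTransferObject());
    }

    public static void main(String[] args) {
        CustomerTO to = newInstance();
        to.setName("Ali");
        System.out.println(to.getName());
    }
}
```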
Looking at the top 25 bugs, it looks like there are quite a few problems lingering around.
Oh yeah, my other bug was the continued usage of deprecated Calendar/Date methods in the zip package. A slightly cosmetic bug, but it could be serious considering that the deprecated Date methods synchronize on an internal static Calendar. So every thread using the deprecated Date constructor will block. This can be bad when your app uses it to check date diffs (like WebLogic used to do to see if JSPs/Servlets have changed in a specific interval).
I was always a little leery of friendster.com and sites like that. I'm married so I wasn't looking for online dating and stuff. And I seemed to be able to schedule my life and meet with friends without needing the help of a "social networking" site.
The other day (after demoing TIBCO's really cool Ajax framework) someone sent me an invite to LinkedIn.com. I ignored it at first, but this guy also sent an invite to a couple co-workers who then sent me new invites.
So I succumbed and joined up. LinkedIn isn't a social networking site, it's a business networking site. I've only been playing around with it for a few days, but it's really interesting. I wish we had this back when I was doing consulting. It would have made finding contracts a lot easier.
It's interesting how many people you end up being linked to. No links to Bill Gates or Torvalds yet, but I'm still hoping. My goal now is to get close to someone who can hire me at Google.
LinkedIn is free and everything but I suspect they'll start charging soon for some of the advanced features. I just wish they would get a good graphical representation of the links between contacts. It's sad, but even though I crank out tons of back-end code day in and day out, I'm a sucker for a good graphic. You can never pay too much for a good graphic design.
Next post is about why the bugs I submit to Sun never get fixed (two years and counting).
So I've been looking at AJaX frameworks over the past few weeks (actually a bunch of coworkers have been doing most of the work, but I'm interested). It seems like everybody has recently jumped onto the bandwagon and is now declaring themselves an AJaX framework/architecture/founder/etc.
Now, theoretically the cool UIs could have been done with traditional server-rendered HTML apps, but AJaX makes it easier because you can move stuff around just like in a rich Windows app. This was a pain in the olden days because you needed lots of server hits, and the user didn't like waiting for refreshes to complete to see new sections or get new content.
So there are a few up and comers but no one is perfect yet. I've looked at General Interface (now owned by TIBCO), Isomorphic, DreamFactory, BackBase and Bindows. I'll go into more detail in a future post but for now I'll flip when I see a good open source IDE/Component set that talks with xml services and renders well on IE6 and Firefox.
I bought the new Kasabian cd at Criminal Records and got kind of annoyed that it is not actually a proper cd at all. It looks like a cd and played in my car cd player like a proper cd, but it has special software on it by SunnComm called "MediaMax" that is designed to prevent "unauthorized copies" (it also prevents authorized copies too).
Of course, the first thing I wanted to do was rip it into iTunes. Since I have autorun disabled, their stupid software never launched and I was able to rip it properly using iTunes and CDex.
I found this decent site that explains it all http://www.cs.princeton.edu/~jhalderm/cd3/.
It's rather annoying that a record company would attempt to screw up our fair use rights to create backups and use the music as we wish. It's only humorous for now that BMG would pay good money for this mostly useless technology; one day they will get it right and it will really suck.
So I've been travelling a bit. San Francisco last week and this week in Denver. Getting a tour of various hotels and airlines.
So, in pre-jdk1.5, StringBuffer is one of the most widely used classes. Especially in Servlets and JSPs, this class is used everywhere. The problem is that it is thread-safe: every method is synchronized. This might be good for some people, but it is terrible for web performance. With request/response processing, buffers are never shared between threads, so all that synchronization is wasted.
I looked around and couldn't find any open source non-synchronized StringBuffers, and my projects can't run on jdk1.5 yet (to take advantage of the new unsynchronized StringBuilder), so I had to write a new non-thread-safe/non-synchronized FastStringBuffer class. I also wrote a new FastStringWriter class to replace java.io.StringWriter that uses FastStringBuffer instead of StringBuffer.
So anyway, I need to clean up the class and submit it to jakarta commons lang or something. But it is a decent performance boost (about 50-200ms per page) that may help out your project.
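In the meantime, the core of the class is small enough to sketch here (a trimmed-down version with just append/length/toString, not the whole StringBuffer API):

```java
public class FastStringBuffer {

    // A bare-bones unsynchronized growable char buffer; same idea as
    // jdk1.5's StringBuilder.
    private char[] value;
    private int count;

    public FastStringBuffer() {
        this(16);
    }

    public FastStringBuffer(int capacity) {
        value = new char[capacity];
    }

    public FastStringBuffer append(String s) {
        int len = s.length();
        ensureCapacity(count + len);
        s.getChars(0, len, value, count);
        count += len;
        return this;
    }

    public FastStringBuffer append(char c) {
        ensureCapacity(count + 1);
        value[count++] = c;
        return this;
    }

    public int length() {
        return count;
    }

    // Grows the backing array by at least doubling, amortizing copies.
    private void ensureCapacity(int needed) {
        if (needed > value.length) {
            char[] bigger = new char[Math.max(needed, value.length * 2)];
            System.arraycopy(value, 0, bigger, 0, count);
            value = bigger;
        }
    }

    public String toString() {
        return new String(value, 0, count);
    }

    public static void main(String[] args) {
        FastStringBuffer buf = new FastStringBuffer();
        buf.append("Hello, ").append("world").append('!');
        System.out.println(buf);
    }
}
```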
So I've been trying to track down some performance issues with my applications on MVS, Unix and Windows. I'm not using any professional J2EE profilers yet, but a co-worker of mine wrote some perf logging stuff using Aspectwerkz' AOP stuff.
It bogs the app down, but I can get method level performance monitoring.
It's pretty tedious tracking down performance problems in a huge application.
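Without an AOP framework, the crudest version of that method-level timing is just wrapping the work in a stopwatch; here's a sketch (real per-method instrumentation is what the Aspectwerkz advice gives you for free):

```java
public class PerfLog {

    // Poor man's method-level timing: wrap the call site by hand when you
    // don't want the overhead/complexity of an AOP framework.
    public static long time(Runnable work) {
        long start = System.currentTimeMillis();
        work.run();
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        long elapsed = time(new Runnable() {
            public void run() {
                StringBuffer sb = new StringBuffer();
                for (int i = 0; i < 100000; i++) {
                    sb.append(i);
                }
            }
        });
        System.out.println("took " + elapsed + "ms");
    }
}
```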
So far, I've used Microsoft's Web Application Stress Tool. They released this thing years ago for free, and it's still about as good as Mercury's LoadRunner or Rational's load testing stuff. The advantage of MS' is that it is free and lightweight enough to run on any windows PC. It basically records HTTP scripts and runs them from 1-100 or so simultaneous "user" threads. It can kill any app pretty quickly and is useful for simulating a load on any app.
Any performance profiling tips are appreciated. Most of the stuff I've read is vague and stupid like "Look for memory leaks" or "Identify the bottleneck" blah blah blah.
I'm also reading David Mitchell's Ghostwritten and Rudyard Kipling's Kim. So far so good. Recommend something for when I finish these guys.
So I'm going on a cruise this week and I'll be out of the country. This probably means that the quality of this site will increase while I'm away.
Oh yeah, audioscrobbler.com is really cool. I just need to modify the windows plug-in so it tracks songs listened to on the iPod.
I use Microsoft Money 2005. I bought this version to get automatic downloads from MBNA and Citibank. Of course this didn't work, because each time I downloaded the statement it marked every transaction as brand new and auto-accepted them. This of course sucked, as I had to go in and delete all the duplicate downloads every time it updated. I emailed Microsoft and they basically said that I'm screwed and that they have no fix for the problem. Wait for 2006, I guess.
Anyway, my workaround is to write my own app to screen scrape the data from MBNA and Citibank (and theoretically any site) and import into money. If anyone has any ideas let me know.
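For the record, my current thinking is a two-step job: fetch the statement page, then regex the transaction rows out of it. The parsing half might look like this (the row format below is completely invented; each bank's HTML needs its own pattern, and the login/cookie handling is the genuinely hard part):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StatementScraper {

    // Pulls (date, description, amount) rows out of a statement page.
    // This table layout is hypothetical; adjust the pattern per bank.
    private static final Pattern ROW = Pattern.compile(
            "<td>(\\d{4}-\\d{2}-\\d{2})</td><td>([^<]+)</td><td>(-?\\d+\\.\\d{2})</td>");

    public static List<String[]> parse(String html) {
        List<String[]> rows = new ArrayList<String[]>();
        Matcher m = ROW.matcher(html);
        while (m.find()) {
            rows.add(new String[] { m.group(1), m.group(2), m.group(3) });
        }
        return rows;
    }

    public static void main(String[] args) {
        String sample = "<tr><td>2005-01-03</td><td>GROCERIES</td><td>-42.17</td></tr>";
        String[] row = parse(sample).get(0);
        System.out.println(row[0] + " " + row[1] + " " + row[2]);
    }
}
```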
This is the first post for the new year.
I'm a software architect and I come across all sorts of technical bugs, techniques, fixes, etc. I frequently find that I can't google for the answer and have to actually (shudder) figure it out for myself. To save other poor souls from having to suffer through the same pain, I'll post answers here so fellow googlers can find them.
So for anything J2EE/Unix/Windows/MVS (I know, I know, I thought it was extinct too)/Oracle/DB2/WebSphere/WebLogic/Oracle App Server/Open Source/etc., ask me any questions, or stay tuned as I update this site from time to time.