Thursday, January 17, 2013

SOA Software Guidelines: Service & Data Design, Strategy/Governance

This is a part of the blog series about (SOA) software guidelines. For the complete list of the guidelines (i.a. about design, security, performance, operations, database, coding, versioning) please refer to:

Service Design


  • Use mature technology: be a late adopter and follow the mainstream by using popular tools and technologies. Cutting-edge products are usually less reliable than stable ones, so avoid version 0.x products. Well-accepted products usually have better user support communities (blogs, discussion groups, etc.), and vendors of successful products have usually grown big enough to offer well-resourced customer support.
  • Use well-accepted standard solutions (e.g. OAuth, OpenID, WS-Security, SAML, WS-Policy) for better interoperability & security; don't reinvent the wheel. Standard solutions usually result from cooperation among many developers, so their design and implementation are better tested by the community. For sensitive topics such as security or distributed transactions, it is difficult and risky to build bulletproof code on your own.
  • In general, avoid premature optimization: build a prototype fast, then run performance tests, and redesign only if the performance doesn't meet the SLA. But if the risk profile of the project indicates that performance is critical (e.g. options trading), include optimization from the early design stages (e.g. multi-threading).
  • Validate architecture & design early with prototyping and (performance) tests. Know the cost of specific design choices / features, and be prepared to cut features or rework areas that do not meet specifications.
  • Don't be cheap. Think broadly in terms of ROI: don't save money at the expense of long-term scalability, maintainability, productivity and reliability. Man-hours cost much more than hardware, so it is better to buy reliable, high-performance hardware that lets your team work faster with fewer problems. Buy a better gateway/firewall that can (intelligently) reject DoS attacks and is easy to configure. For your developers, buy "state of the art" PCs with 2 monitors and abundant RAM & storage so they can work smoothly.
  • Don't assume an out-of-the-box solution covers everything (e.g. "my framework will take care of all my security/scalability/transaction issues"). Test and verify the new bounds of operation (e.g. verify that the framework really secures your application) and verify the effects on other quality aspects (e.g. a new framework improves security but now pushes performance below the SLA).
  • In general, scaling up (buying better hardware) is easier than scaling out (distributing workloads by adding more servers), so try scaling up first whenever possible. Be specific about which resource is to be scaled up (e.g. CPU, memory, network). Scaling out also adds problems such as synchronization between servers, distributed transactions, how to define (horizontal) data partitions, and more difficult recovery/fault-tolerance implementation.
  • Optimize the application / query & database design first (look for problems such as locking, missing database indexes, Cartesian product queries) before scaling up / scaling out or digging into database tuning. See "Where to prioritize your effort regarding software performance":
  • Use separate environments: sandbox/playground, development, test, production.
  • Reuse design/configurations: if you have a successful design/configuration, use it again in other places. Reuse means fewer permutations of configurations, which are thus easier to manage / learn. It's also more robust since there are fewer things to get wrong and it has already been tested.
  • Scope your requirements. One solution for one problem; avoid trying to build an application capable of all things for all people. Don't try to anticipate every future problem, since it's difficult to make accurate assumptions about the future.
  • Design is a combination of science and art. Make decisions based on facts; if the variables are not known, use an educated guess (e.g. based on historical usage data).
  • Design metrics / measurable objectives to validate your design parameters (e.g. throughput in Mbps, conversion rate of the web shop, etc.).
  • Recognize and remove design contradictions as early as possible. Write down the trade-off decisions and their implications (e.g. security vs performance).
  • Beware of overusing firewalls: it adds frustration & lost time during development, test and production, leads to unnecessarily complex solutions (workarounds via a bastion host) or even renders a functional requirement impossible. You might leave low-value static content (e.g. CSS, static images) without a firewall.
  • Develop in-house vs buying a complex out of the box:
Principles of Service Orientation (
  • Standardized Contract (e.g. wsdl, xsd schema, ws-policy). Advantages: interoperability, reduce transformation due to consistent data model, self documented (the purpose, capabilities e.g. QoS, message contents)
  • Loose Coupling
    • Advantages: maintainability, independent (the interfaces can evolve over time with minimum impact to each other), scalable (e.g. easier for physical partitions/clustering)
    • Characteristics: decouple service contract/interface from implementation/technology details, asynchronous message based (e.g. JMS) instead of synchronous RPC based, separation of logical layers.
  • Service Abstraction: separate service contract/interface from implementation, hide technology & business logic details.
  • Reusability
    • Advantages: faster and cheaper for realization of new services
    • Characteristics: agnostic service, generic & extensible contract. Avoid any messages, operations, or logic that are consumer or technological specific, reuse services (e.g. for compositions), reuse components from other projects (e.g. xsd/common data model, wsdl, operations, queues)
    • How do you anticipate this service being reused? How can modifications be minimized?
  • Service autonomy (has a high level of control over its runtime environment). Advantages: increase reliability, performance, predictability
  • Statelessness. Advantages: scalability, performance, more reusability due to agnostic / less affinity.
  • Discoverability (e.g. using UDDI)
  • Composability (e.g. using BPEL)
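
To make a few of these principles concrete (contract separated from implementation, service abstraction, statelessness), here is a minimal Python sketch. It is only an illustration under my own assumptions: the names StudentService, StudentRequest, StudentResponse and InMemoryStudentService are hypothetical, and in a real SOA stack the contract would typically live in a WSDL/XSD rather than in an abstract class.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class StudentRequest:
    """Contract message (hypothetical): only consumer-neutral fields."""
    student_id: str


@dataclass(frozen=True)
class StudentResponse:
    """Contract message (hypothetical) returned to any consumer."""
    student_id: str
    email: str


class StudentService(ABC):
    """The contract. Consumers depend on this, never on an implementation."""

    @abstractmethod
    def get_student(self, request: StudentRequest) -> StudentResponse:
        ...


class InMemoryStudentService(StudentService):
    """One possible implementation. It could be swapped for an LDAP- or
    database-backed one without touching the consumers."""

    def __init__(self, records: dict[str, str]) -> None:
        # Read-only lookup data; no per-caller session state is kept.
        self._records = dict(records)

    def get_student(self, request: StudentRequest) -> StudentResponse:
        # Stateless: everything needed to answer is in the request itself.
        return StudentResponse(request.student_id, self._records[request.student_id])


def lookup_email(service: StudentService, student_id: str) -> str:
    """A consumer coded against the contract only."""
    return service.get_student(StudentRequest(student_id)).email


service = InMemoryStudentService({"s1": "s1@example.org"})
print(lookup_email(service, "s1"))
```

Because lookup_email only knows the abstract StudentService, replacing the implementation does not propagate changes to the consumer, which is the loose-coupling advantage listed above.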

Design principles

  • Where are the tightest couplings/dependencies with other services, other systems, etc?
  • Use layered design (e.g. presentation layer, business logic layer) for cohesion, maintainability & scalability.
  • What patterns have been employed? (see e.g. for SOA patterns or Hohpe's book for messaging patterns)
  • Aspect-oriented programming: separate cross-cutting concerns from the main code (e.g. declarative security using policies).
  • Use progressive processing instead of blocking until the entire result is finished, e.g. incremental updates (using a JMS topic per update event instead of bulk-updating the whole database every night), or rendering the GUI progressively with separate threads (using Ajax, for example). Use a paging GUI (e.g. display only 20 results and provide a "next" button). This strategy improves performance, user experience, responsiveness and availability.
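
The paging idea can be sketched in a few lines of Python. This is a simplified illustration, not a full implementation: the function name pages is my own, and a real service would usually push the paging down into the database query instead of slicing a list in memory.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def pages(results: Sequence[T], page_size: int = 20) -> Iterator[List[T]]:
    """Yield results one page at a time instead of returning them all at once."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    for start in range(0, len(results), page_size):
        yield list(results[start:start + page_size])


all_results = list(range(45))      # stand-in for a large result set
pager = pages(all_results)
first_page = next(pager)           # can be shown to the user immediately
second_page = next(pager)          # produced only when the user clicks "next"
print(len(first_page), len(second_page))
```

Since pages is a generator, the later pages are not built until they are requested, so the caller gets the first 20 results without waiting for the rest.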


  • Simplify the requirement (80/20 Pareto prioritizing), design and implementation.
  • Cut specifications / features which are unnecessary.
  • Minimalist design and implementation. Does this service fulfill its goal with minimum effort?
  • Minimize the number of components & connections.
  • Minimize the number of applications & vendors to avoid integration problems.
  • Avoid duplicate provisioning (e.g. authentication data in both LDAP and a database), since you then have the extra problem of keeping them synchronized.


Design decisions

  • For every design decision, evaluate its impact on functional & non-functional requirements (e.g. performance, security, maintainability) and on the project plan/constraints (cost, deadline, resources & skills).
  • Prioritize your non-functional requirements for design trade-offs (e.g. if performance ranks above security & reliability, you might avoid message encryption & persistent JMS).
  • Which integration style does this service use (e.g. RPC-like web service, file transfer, shared database, messaging)?
  • Does this service wrap a legacy application or database? Does this application/database already provide out-of-the-box SOA integration capabilities (e.g. web services, messaging triggers) that I can use? Can I replace the underlying application/database with another vendor/version without propagating many changes to other services?
  • Where will the services be deployed? e.g. cloud providers, internal virtual machines, distributed servers around the world, local PC, etc.
  • Which trade-off do you choose for message structures: rigid contract (WSDL/SOAP) vs flexibility (e.g. no WSDL, REST, generic key-value inputs), considering security / message validation, performance, extensibility, and chains of WSDL/XSD changes?
  • Avoid concurrent programming if possible since it's error-prone. If you decide to use concurrency, make sure that multi-threading problems (race conditions, deadlocks) have been addressed and tested.
  • Which transport protocols do you use (e.g. HTTP-SOAP, HTTP-REST, JMS) and why? Do you need to wrap the protocol (e.g. sync HTTP to wrap async JMS)? Be aware that your platform may have non-standard protocols that offer better performance (e.g. WebLogic T3, WebLogic SOA-Direct).
  • Do you consider an event-driven paradigm (e.g. JMS topic)?
  • Understand the features of your frameworks (e.g. security, transactions, failover, cache, monitoring, load balancing/clustering/parallelizing, logging). Using framework features simplifies your design, so you don't have to reimplement them. Read the vendor's recommendations / best-practice documents.
Requirement management
  • Have you followed the standards & laws? e.g. the SOA guidelines document in your organization, laws such as Sarbanes-Oxley (US) / Wet bescherming persoonsgegevens (Netherlands), etc.
  • Are there any real-time requirements (e.g. nuclear plant control system / TUD-IRI)?
  • What is the functional category of this service?
    • Business Process e.g. StudentRegistrationBPELService
    • Business Entity e.g. OsirisService
    • Business Functions e.g. PublishStudentService
    • Utility, e.g. EmailService, Scheduler, LoggingService
    • Security Service (handle identity, authorization)
  • What is the landscape level of this service: Atomic, Domain, Enterprise? (see The Definitive Guide to SOA by Davies)
  • Does this service fulfill the functional & non functional requirements defined in the specification document?
  • Avoid constantly changing requirements. Avoid feature creep.

Asynchronous pattern

The benefits of async messages:
    • avoid blocking thus improve responsiveness & throughput (for better performance, user experience & availability)
    • improve reliability / fault tolerance with persistent queues & durable subscribers
    • loose coupling between producer & consumer (queue) or publisher & subscriber (topic)
    • defer heavy processing to off-peak periods (improve performance & availability)
The drawback of async messages is the complexity of implementation:
    • how to persist messages in the queue in case of a server fault
    • how to handle messages that are not delivered (e.g. a fault in the subscribers)
    • how to handle duplicate or out-of-sequence messages
    • how to inform the caller about the processing status (e.g. via a status queue or a status database table)
However, with the advances in enterprise integration frameworks (e.g. Oracle OSB), it's becoming easier to deal with these problems.
Beware that some processes need direct feedback (e.g. form authentication, form validation), where a synchronous pattern is more appropriate.
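
As an illustration of the "duplicate or out-of-sequence messages" problem, below is a minimal Python sketch of a consumer that ignores duplicates and releases messages in order. It rests on assumptions that are mine, not part of any particular framework: the producer numbers messages consecutively starting at 1, and the state is kept in memory (a real consumer would persist it so that it survives a restart). The class name OrderedIdempotentConsumer is hypothetical.

```python
from typing import Dict, List


class OrderedIdempotentConsumer:
    """Drops duplicates and releases messages in sequence order."""

    def __init__(self) -> None:
        self._next_seq = 1                   # next sequence number to release
        self._pending: Dict[int, str] = {}   # early arrivals, keyed by sequence number
        self.processed: List[str] = []       # payloads in the order they were released

    def on_message(self, seq: int, payload: str) -> None:
        if seq < self._next_seq or seq in self._pending:
            return  # duplicate: already processed or already buffered
        self._pending[seq] = payload
        # Release every buffered message that is now contiguous.
        while self._next_seq in self._pending:
            self.processed.append(self._pending.pop(self._next_seq))
            self._next_seq += 1


receiver = OrderedIdempotentConsumer()
# Messages arrive out of order and some are delivered twice.
for seq, payload in [(2, "b"), (1, "a"), (2, "b"), (4, "d"), (3, "c"), (1, "a")]:
    receiver.on_message(seq, payload)
print(receiver.processed)
```

The trade-off of this approach is that one missing message holds back everything after it, so in practice you would combine it with a timeout or a redelivery request.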

Software process / governance

  • Establish standards / guidelines for your department and summarize them into checklists for design & code review.
  • Include design & code review in your software development process. See
  • Establish architecture policies for your department. Establish a clear role for guarding the architecture policies and guidelines, e.g. the architects through design/code review.
  • For maintainability & governance: limit the technologies used in the projects. Avoid constantly changing technology while staying open to new ideas. Provide stability so developers can master the technology.
  • Establish change control. More changes mean more chances of failure. You might need to establish a change committee to approve change requests. A change request consists of the reason, risks, back-out/undo plan, version control of configuration files, and schedule. Communicate the schedule to the affected parties beforehand.
  • Use an SLA to explicitly write down user expectations: availability (max downtime), hours of service (critical working hours, weekends/holidays), max users/connections, response time/processing rate (normal mode, degradation mode), monitoring/alerts, crisis management (procedures to handle failures, who/how to get informed, which services/resources have priority), escalation policy, backup / recovery procedure (how much to back up, how often, how long to keep, how long to recover), limitations (e.g. dependency on an external cloud vendor). Quality comes at a price: be realistic when negotiating an SLA, e.g. 99.99% availability means that you have to guarantee less than 1 hour (about 53 minutes) of downtime per year, which is quite tough to achieve.
  • Have a contingency plan / crisis management document ready: procedures to handle failures, failover scripts, how to reboot, how to restart in safe mode, configuration backup/restore, how to turn on/turn off/undeploy/deploy modules/services/drivers, who/how to get informed, which services/resources have priority (e.g. telephony service, logging service, security services). Keep multiple printed copies of this document (the intranet and printers may not work during a crisis). The crisis team should have exercised the procedures (e.g. under a simulated DoS attack) and measured the metrics during the exercise (e.g. downtime, throughput in degradation mode). Plan team vacations such that at least one crisis team member is always available. Some organizations need a dedicated 24/7 full-time monitoring & support team.
  • Document incidents (root causes, solutions, prevention, lessons learned) and add the incident handling procedures to the crisis management document.
  • Establish consistent communication channels between architects, the developer team, external developers, testers, management, and stakeholders/clients (e.g. documentation in a Trac wiki).
  • Communicate design considerations and issues to the project team/developers and stakeholders.
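
When negotiating the availability figure in an SLA, it helps to translate the percentage into a downtime budget. The small Python sketch below does that arithmetic; it assumes a 365-day year and counts all downtime equally (planned or not), and the function name is just my own choice.

```python
def max_downtime_minutes_per_year(availability_percent: float) -> float:
    """Downtime budget implied by an availability target, for a 365-day year."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    minutes_per_year = 365 * 24 * 60  # 525,600 minutes
    return minutes_per_year * (100 - availability_percent) / 100


for target in (99.0, 99.9, 99.99):
    budget = max_downtime_minutes_per_year(target)
    print(f"{target}% availability -> {budget:.1f} minutes of downtime per year")
```

For 99.99% this gives roughly 52.6 minutes per year, which is the "less than 1 hour" mentioned above; each extra nine cuts the budget by a factor of ten.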


Data design
  • Do you use a common data model? How do you enforce consistent terminology and semantics between terms used in different systems (databases, LDAP, ERP, external apps, etc.)?
  • Use simple SOAP data types (e.g. int). A new data type introduces overhead during (de)serialization of the messages. Don't use xsd:any type.
  • How will the data be synchronized between different databases, LDAP directories and external systems?
  • Use MTOM/XOP instead of SwA or inline content for transmitting attachments / large binaries.
  • Consider the claim check pattern to avoid unnecessary data processing:
  • Consider different ways to store data: RDBMS (for highly relational data, ACID), NoSQL/document store (simple key-value data, better scalability / easier to split), file system (better for read-only data).
  • How will your service handle localization/globalization and different encodings/formats? e.g. Unicode for diacritics/Cyrillic/Chinese, the DateTime format in your web service vs the format in the database/LDAP/external web service. Do you consider the effect of time zones?
  • If you use file-based integration: are the file permissions right (e.g. access denied for apache-user trying to read a file generated by weblogic-user)? Is the encoding right?
  • Are transactions required? Are compensations/rollbacks required?
  • Be aware of the complexity/scalability of the algorithms you use.
  • Choose the data structure based on usage needs (e.g. a tree can be faster for search but generally slower for insert/update) and the particular properties (size, ordered, set, hash key, etc.) that motivate you to choose that structure.
  • Choose the right data format so that the need for transformation is minimized.
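
The claim check idea mentioned in the list above can be sketched as follows: the large payload is parked in a shared store and only a small ticket travels through the intermediate services. This is a simplified Python illustration under my own assumptions; PayloadStore is a hypothetical in-memory stand-in for what would normally be a database, file share or object store, and the message layout is invented for the example.

```python
import uuid
from typing import Dict


class PayloadStore:
    """Hypothetical shared store for large payloads (in memory for this sketch)."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def check_in(self, payload: bytes) -> str:
        """Store the payload and return a claim ticket."""
        ticket = str(uuid.uuid4())
        self._items[ticket] = payload
        return ticket

    def claim(self, ticket: str) -> bytes:
        """Fetch and remove the payload that belongs to a ticket."""
        return self._items.pop(ticket)


store = PayloadStore()
large_document = b"x" * 1_000_000

# The producer sends only the small ticket through the intermediate services.
message = {"type": "DocumentReceived", "claim_ticket": store.check_in(large_document)}

# The final consumer, the only one that needs the data, redeems the ticket.
restored = store.claim(message["claim_ticket"])
print(len(message["claim_ticket"]), len(restored))
```

The intermediate services then route and transform a message of a few dozen bytes instead of parsing a megabyte they never look at.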
Keep data small
  • Use short element/attribute names to reduce xml size.
  • Limit the level of XML nesting.
  • Delete old / low-value data, and move old data to backup / lower-tier storage. Determine how long old data is kept in production and how long it is kept in backup / lower-tier storage. Storage (& its maintenance) is not free.
  • Reduce the data with transformation (e.g. summarizing, data cleaning, sampling).
  • Consider message compression (e.g. gzip Content-Encoding in the HTTP header).
  • Structure your XSD such that you minimize the information transmitted: use optional elements and minimize the data for a specific purpose (e.g. you don't need to send nationality data if you just want to transmit the email address of a person).
  • SOAP web service style: use literal instead of encoded. The encoded style has several problems: compatibility, extra data traffic for the embedded data types.
  • Use the simplest/smallest data type (e.g. use a short int instead of a long int if a short is good enough).
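
To get a feeling for what message compression can buy you, here is a small Python sketch that gzips a repetitive XML payload with the standard gzip module. The payload is made up for the illustration, and the ratio you get depends heavily on the real message content, so measure with your own messages.

```python
import gzip

# A repetitive XML payload, as messages with many similar elements often are.
xml = (
    "<students>"
    + "".join(
        f"<student><id>{i}</id><nationality>NL</nationality></student>"
        for i in range(1000)
    )
    + "</students>"
).encode("utf-8")

compressed = gzip.compress(xml)
ratio = len(compressed) / len(xml)

# Round trip: the receiver gets back exactly the original bytes.
assert gzip.decompress(compressed) == xml
print(f"{len(xml)} bytes -> {len(compressed)} bytes ({ratio:.0%} of original)")
```

Keep in mind that compression trades CPU time for bandwidth, so it pays off mainly for larger messages on slower links.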
Data management
  • Can you categorize the quality level of the data? e.g. sensitive data (encrypted/SSL), non-sensitive data (neither encrypted nor signed, to increase performance), critical data (live backup/redundancy), non-critical data (backed up less often), reliable messages (transported with a persistent queue for guaranteed delivery), time-critical/real-time data (use dedicated high-performance hardware/networks).
  • How will the data be backed up? How do you secure the backup data?
  • Is the content correct/accurate? Spelling checked?
  • Logically group data for readability & maintainability (e.g. use a person struct to contain data about a person: name, birth date, etc.).
  • Data is not free: be aware of the cost of data (for network bandwidth, processing, storage, people/power/space to maintain storage, backup cost). Eliminate low-value & high-cost data, sample the low-value and low-cost data.
  • Tiered storage based on value and age.
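
A tiering rule based on value and age could look like the Python sketch below. Everything in it is illustrative: the thresholds (90 days, 2 years), the tier names and the function name storage_tier are my assumptions, and real retention periods are usually dictated by business and legal requirements.

```python
from datetime import date


def storage_tier(last_used: date, high_value: bool, today: date) -> str:
    """Pick a storage tier from age and value. All thresholds are illustrative."""
    age_days = (today - last_used).days
    if age_days <= 90:
        return "production"
    if age_days <= 365 * 2 or high_value:
        return "lower-tier"   # cheaper, slower storage or backup
    return "delete"           # old and low value


today = date(2013, 1, 17)
print(storage_tier(date(2013, 1, 1), high_value=False, today=today))  # production
print(storage_tier(date(2012, 1, 1), high_value=False, today=today))  # lower-tier
print(storage_tier(date(2009, 1, 1), high_value=True, today=today))   # lower-tier
print(storage_tier(date(2009, 1, 1), high_value=False, today=today))  # delete
```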
Source: Steve's blogs

Any comments are welcome :)



  • Code Complete by McConnell
  • Report review & test checklist, University of Washington
  • IEEE Standard for Software Reviews and Audits 1028-2008
  • The Definitive Guide to SOA by Davies
  • Enterprise Integration Patterns by Hohpe
  • SOA Principles of Service Design by Erl
  • Improving .NET Application Performance and Scalability by Meier
  • Patterns of Enterprise Application Architecture by Fowler
  • Hacking Exposed Web Applications by Scambray
  • OWASP Web Service Security Cheat Sheet
  • OWASP Code Review Guide
  • Improving Web Services Security (Microsoft patterns & practices) by Meier
  • Concurrency Series: Basics of Transaction Isolation Levels by Sunil Agarwal
