Monday, March 23, 2015

Enterprise Design Patterns and Practices (Boise Code Camp 2015)

At this year's Boise Code Camp I presented on Enterprise Design Patterns and Practices. Afterwards I was asked to post my slide deck and notes. If there is interest, I'm thinking of doing a blog series on this.

Fundamental principles

 S.O.L.I.D. Principles

·         A class/method/tier should have only a single responsibility: it should do one thing and do it well, even if that one thing is telling other things what to do.
·         Software entities should be open for extension but closed for modification, and always favor extension over inheritance.
·         Design by contract: you should never need to care what the object really is.
·         Multiple client-specific interfaces are better than one general-purpose interface.
·         Depend on abstractions, not concrete implementations. Use a factory pattern, DI container, lazy-loaded properties, etc.
These are the fundamental principles of OO development, and they apply equally to application architecture.
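
As a rough illustration of the last principle, here is a minimal Python sketch (the talk itself was .NET-flavored, so treat the language as a stand-in). The names `MessageSender`, `InMemorySender`, and `OrderNotifier` are made up for this example. The point is only that the caller receives an abstraction and never constructs a concrete sender itself.

```python
from typing import List, Protocol, Tuple


class MessageSender(Protocol):
    """The abstraction callers depend on (hypothetical name)."""

    def send(self, to: str, body: str) -> None: ...


class InMemorySender:
    """One concrete implementation; an SMTP or SMS sender could replace it."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


class OrderNotifier:
    """Depends only on MessageSender, never on a concrete class."""

    def __init__(self, sender: MessageSender) -> None:
        self._sender = sender

    def order_shipped(self, customer_email: str, order_id: int) -> None:
        self._sender.send(customer_email, f"Order {order_id} has shipped.")
```

Swapping the implementation is then a change at the place where `OrderNotifier` is constructed (a factory or DI container), not inside `OrderNotifier`.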

Be strongly typed and loosely coupled

By creating strongly typed interactions between objects you enforce consistent behavior, which helps prevent difficult-to-find runtime errors.  Decoupling the objects in your application allows you to change the underlying object without changing the caller.

The 2 biggest problems you will face

Golden Hammer

Having go-to tools is great, but just because a tool can do the job doesn't mean it should be used.

Silver Bullets

Everything from here on down may help, but none of it will solve all of your problems. Always looking for the next technology to solve everything will only end in pain.  These are tools in a toolbox, to be used appropriately.  Used incorrectly they will bring pain, lots of pain.

Automation, Automation, Automation

If it can be reasonably automated, it should be. Computers are really good at being consistent, and humans are not.

Build server & source control

Nothing will fundamentally improve your code more (Jenkins, TFS, TeamCity, Bamboo, Apache Continuum, etc.). Nothing goes out that isn't in source control, EVER!

Application Deployment

Use deployment tools to push changes (MSDeploy, Octopus, etc.); copy and paste is not a recommended deployment methodology.

Database Deployment/Migration

Database changes are code changes too.  Use a migration framework (EF Migrations, FluentMigrator, etc.) or a database change management tool (Visual Studio Database Project, Redgate tools, DbDeploy, ReadyRoll, etc.).
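
To make the idea concrete, here is a tiny hand-rolled sketch of what a migration framework does, using Python's built-in `sqlite3`. It is not how any of the tools above are implemented; the `schema_version` table, the `MIGRATIONS` list, and the `customer` table are all assumptions for the example.

```python
import sqlite3

# Ordered, versioned changes that live in source control with the code.
MIGRATIONS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT"),
]


def current_version(conn: sqlite3.Connection) -> int:
    """Return the highest applied migration number, or 0 for a new database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn: sqlite3.Connection) -> list:
    """Apply every migration newer than the recorded version, in order."""
    applied = []
    version = current_version(conn)
    for number, sql in MIGRATIONS:
        if number > version:
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (number,)
            )
            conn.commit()
            applied.append(number)
    return applied
```

Running `migrate` a second time applies nothing, which is what makes it safe to run on every deployment.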

Testing

Automating your unit, behavior, integration, load, and UX tests allows you to find problems before your clients/customers do.

Instrumentation

Regardless of what you think might be going on, without instrumentation you're blind.

Logging

This is the first and easiest way to find out what is going on under the hood. Don’t write your own unless you really need to; use an existing library (log4net, NLog, Enterprise Library, SmartInspect, etc.).

Memory Profiler

While a logger can tell you what is going on in the code, a memory profiler (Redgate ANTS, JetBrains Memory Profiler, etc.) can tell you what is going on in memory.

Site monitoring and Analytics

Just because your application is live doesn't mean you can stop paying attention to it. Knowing how many people are interacting with it, how they are interacting with it, and when tells you what has happened and what is currently happening, so you can make an educated guess at what is going to happen.

Leveraging 3rd party tools

Focus on your core features and leverage 3rd party tools to deal with the grunt work.

Abstract your 3rd party includes

Using 3rd party tools is great and will save you from re-inventing the wheel, but to keep from being locked in, always abstract away the implementation when you can.  In the future the library may no longer be supported, or you may run into licensing problems, etc. Being able to quickly insert a replacement will save you in the long run.
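
A small sketch of what that abstraction can look like, in Python. The standard `json` module stands in for "the 3rd party library" here, and `Serializer`, `JsonSerializer`, and `save_settings` are names invented for the example. The adapter class is the only place that touches the library.

```python
import json
from typing import Any, Protocol


class Serializer(Protocol):
    """What the rest of the application is allowed to know about."""

    def dumps(self, value: Any) -> str: ...

    def loads(self, text: str) -> Any: ...


class JsonSerializer:
    """Adapter: the only class that calls the underlying library."""

    def dumps(self, value: Any) -> str:
        return json.dumps(value, sort_keys=True)

    def loads(self, text: str) -> Any:
        return json.loads(text)


def save_settings(settings: dict, serializer: Serializer) -> str:
    """Application code: depends on Serializer, not on json."""
    return serializer.dumps(settings)
```

If the library has to go, you write one new adapter and change where it is constructed; `save_settings` and its callers stay as they are.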

Building for Extendibility

Separate your application into logical segments or tiers.   

n-Tier Design

The 3 major application tiers are (but not limited to)
·         UI – For displaying or transmitting information (webpage, API endpoint, etc.)
·         Domain or Business – This is where the work is done and the decisions are made
·         Data Access – Where the data comes from (DB call, Web Service call, file read, etc.)
A very important rule is that a tier can only see the tier next to it. The UI should have no concept of where the data it’s displaying comes from; all it cares about is that it made a request to the Domain and got data back.  The same applies to Data Access: all it knows or cares about is returning the data requested by the Domain.  This logical separation allows you to reuse and refactor your application cleanly without fear of breaking changes in an outside tier.
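
Here is a deliberately tiny Python sketch of the "only see the tier next to you" rule. Everything in it is hypothetical: the class names, the in-memory dictionary standing in for a database, and the tax calculation standing in for business logic.

```python
from typing import Optional, Tuple


class ProductDataAccess:
    """Data access tier: fetches rows, knows nothing about business rules."""

    def __init__(self) -> None:
        # Stand-in for a database table: id -> (name, price in cents).
        self._rows = {1: ("Widget", 250)}

    def find_product(self, product_id: int) -> Optional[Tuple[str, int]]:
        return self._rows.get(product_id)


class ProductService:
    """Domain tier: makes the decisions, talks only to data access."""

    def __init__(self, data: ProductDataAccess, tax_percent: int) -> None:
        self._data = data
        self._tax_percent = tax_percent

    def priced_product(self, product_id: int) -> Optional[Tuple[str, int]]:
        row = self._data.find_product(product_id)
        if row is None:
            return None
        name, price_cents = row
        # Integer math in cents avoids floating point rounding surprises.
        return name, price_cents * (100 + self._tax_percent) // 100


def render_product(service: ProductService, product_id: int) -> str:
    """UI tier: talks only to the domain, never to data access."""
    product = service.priced_product(product_id)
    if product is None:
        return "Product not found"
    name, cents = product
    return f"{name}: ${cents // 100}.{cents % 100:02d}"
```

Notice that `render_product` never imports or mentions `ProductDataAccess`; replacing the dictionary with a real database would not touch the UI function at all.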

DTO, POCO, and Model objects

With the tiers isolated from each other we need a way to pass information back and forth. Enter the DTO (Data Transfer Object).  Basically these are classes that hold information; they have little to no logic and know nothing about where they are, where they came from, or where they are going.
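
In Python, a frozen dataclass is one reasonable way to sketch a DTO. The `CustomerDto` fields and the column names in `to_dto` are assumptions made up for this example.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CustomerDto:
    """Holds data only: no logic, no knowledge of database or UI."""

    id: int
    name: str
    email: str


def to_dto(row: dict) -> CustomerDto:
    """Map a data-access row (hypothetical column names) to the DTO."""
    return CustomerDto(
        id=row["customer_id"],
        name=row["full_name"],
        email=row["email_address"],
    )
```

The mapping function is where storage-specific names stop; everything above the data access tier sees only `CustomerDto`.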

Separate Update, Select and Work Logic

Domain classes can get big fast. Keeping them logically separated, each with a more singular function, is in line with the single responsibility principle and keeps them more manageable.
This can be applied to data access classes as well, even if it’s as simple as having 2 interfaces on a data repository class, one for updates and one for selects.  This might seem silly until you start working with publisher and subscriber databases.
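
A minimal sketch of the "2 interfaces on one repository" idea in Python. `CustomerReader`, `CustomerWriter`, and the in-memory repository are hypothetical names; in a real system the reader could later point at a subscriber database and the writer at the publisher without the callers changing.

```python
from typing import Dict, Optional, Protocol


class CustomerReader(Protocol):
    def get_name(self, customer_id: int) -> Optional[str]: ...


class CustomerWriter(Protocol):
    def set_name(self, customer_id: int, name: str) -> None: ...


class InMemoryCustomerRepository:
    """One class that happens to satisfy both interfaces."""

    def __init__(self) -> None:
        self._names: Dict[int, str] = {}

    def get_name(self, customer_id: int) -> Optional[str]:
        return self._names.get(customer_id)

    def set_name(self, customer_id: int, name: str) -> None:
        self._names[customer_id] = name


def greeting(reader: CustomerReader, customer_id: int) -> str:
    """Select logic: only ever asks for the read interface."""
    name = reader.get_name(customer_id)
    return f"Hello, {name}!" if name else "Hello, guest!"


def rename(writer: CustomerWriter, customer_id: int, name: str) -> None:
    """Update logic: only ever asks for the write interface."""
    writer.set_name(customer_id, name.strip())
```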

Do not over-abstract

While using abstractions to separate and decouple your logic is good, if overused or abused these good intentions quickly pave a road to abstraction hell, where adding a single property to your displayed result set requires modifying 5 DTO objects and 4 file mappers. Yes, I know of an application that does this.

 

Building for Scalability

Dealing with Load

Caching   

·         Micro-caching – for lots of requests for the same data that need relatively real-time data (5-10 sec)
·         Memory caching – for things that don’t change that often (1+ hours): user information, product data, geo data (city, state, lat, long, etc.)
·         DB caching – for sharing your cache
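
Micro-caching and memory caching differ mostly in how long an entry lives, so one small time-to-live cache can sketch both. This is a bare-bones Python illustration, not a production cache (no size limit, no thread safety); `TtlCache` and `get_or_load` are names invented here, and the clock is injectable only to make the behavior easy to test.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TtlCache:
    """Caches loader results per key for a fixed number of seconds."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry time, cached value)
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive call
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value
```

With `TtlCache(5)` a burst of requests inside five seconds costs one call to the backing store; with `TtlCache(3600)` the same class covers the slower-changing data.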

Spreading the Load

With your logic separated and decoupled it’s easy to farm your pain points out to other hardware; think out instead of up (more machines rather than a bigger machine).
·         Move expensive functionality (Encryption, PDF/Image creation, etc.) to a dedicated system/systems to prevent load on your main server/servers.
·         Use publisher & subscriber databases to reduce the load on your primary database by moving selects from the main database to the subscribers. This is also a good practice for reporting servers: running a massive metrics report for the sales team should not affect production performance.
·         Aggregate and pre-process data. Just because you saved the data in a specific DB schema doesn’t mean you have to keep it that way.  Having a job that copies data to a more flattened table for faster searching can greatly improve performance, even more so when aggregating from a SQL database (MySQL, MSSQL, Oracle) to a NoSQL store (CouchDB, MongoDB, etc.)
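
The last bullet can be sketched with Python's built-in `sqlite3`. The schema (`customer`, `orders`, `order_summary`) and the sample data are invented for the example; a real job would run on a schedule and would more likely update the flattened table incrementally than rebuild it.

```python
import sqlite3


def build_sample_db() -> sqlite3.Connection:
    """Create a small normalized schema with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, "
        "customer_id INTEGER, total_cents INTEGER)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)", [(1, 1, 1000), (2, 1, 2500)]
    )
    conn.commit()
    return conn


def rebuild_order_summary(conn: sqlite3.Connection) -> None:
    """Pre-aggregate per-customer totals into one flat, search-friendly table."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TABLE IF EXISTS order_summary")
        conn.execute(
            """
            CREATE TABLE order_summary AS
            SELECT c.id AS customer_id,
                   c.name AS customer_name,
                   COUNT(o.id) AS order_count,
                   COALESCE(SUM(o.total_cents), 0) AS total_cents
            FROM customer c
            LEFT JOIN orders o ON o.customer_id = c.id
            GROUP BY c.id, c.name
            """
        )
```

Searches and reports then read `order_summary` with no joins, and the expensive aggregation happens once per job run instead of once per request.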

Queuing up

If you don’t require an immediate synchronous response, message queues are a great way to keep your system fast, prevent data loss (messages persist until the client acknowledges receipt), and distribute workload. Using a Service Bus takes queues to the next level by providing an architecture for distributing workload.
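
The shape of the pattern (producer puts work on a queue, a worker takes it off and acknowledges it) can be sketched with Python's standard `queue` module. This is only an in-process stand-in: unlike a real message broker it does not persist anything, and `worker` and `process_all` are names made up for the example.

```python
import queue
import threading
from typing import List


def worker(jobs: "queue.Queue", results: List[int]) -> None:
    """Consume jobs until the None sentinel arrives."""
    while True:
        job = jobs.get()
        try:
            if job is None:
                return  # sentinel: no more work
            results.append(job * job)  # stand-in for the slow work
        finally:
            jobs.task_done()  # acknowledge, whether it was work or sentinel


def process_all(numbers: List[int]) -> List[int]:
    """Producer side: enqueue everything, then wait for acknowledgment."""
    jobs: "queue.Queue" = queue.Queue()
    results: List[int] = []
    thread = threading.Thread(target=worker, args=(jobs, results))
    thread.start()
    for n in numbers:
        jobs.put(n)  # returns immediately; the caller is not blocked on the work
    jobs.put(None)
    jobs.join()  # blocks until every item has been acknowledged
    thread.join()
    return results
```

With a single worker the results come back in order; adding more workers distributes the load but gives up that ordering guarantee.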

CDN (Content Delivery Network)

Keep your static files fast by having a stripped-down server that only serves static content such as images, CSS, and JavaScript. These servers can also be geo-located for even better performance.

Not everything needs to go to the DB or come from it

Databases are great for storing and retrieving data, but they aren't the only solution.  Lots of data submitted by the user doesn't need to be persisted in the database.  There are better ways to localize your site than storing it in the database.  Customer-uploaded images are another area: keeping the metadata in a database is a good idea, but the image itself should probably go to disk (a CDN would be great for this).

Redundancy

Hardware fails, systems crash, and operating systems need updating; redundant systems keep you online.