What is “Bridging the Gap”?

The movement toward an office with less paper and more efficiency can be quite difficult, and with the wrong tools it can end in failure.  The key challenge is a process I call “Bridging the Gap”: using several applications to create a bridge between the physical and digital worlds, so that documents move from one to the other seamlessly.  So what is required?  How do you create the bridge?

 

On one side of the gap, you have your physical environment: file cabinets, inboxes, stacks of folders on desks, etc.  There are two components that facilitate the crossing:

  • Scanning Hardware – scanners allow the conversion of paper documents into digital documents or images.  Organizations can use scanning copiers, fax machines or dedicated scanners to digitize.
  • Capture Software – capture software works with the scanning hardware to create an efficient and automated bridging process.  It controls the flow of digitized documents, standardizing how they are routed, and uses OCR, barcodes, Advanced Data Extraction (ADE) and other features to automate the collection of information.  It spans the gap and creates a connection to the other side, where the repository sits.

Once the gap has been spanned, the documents need to land somewhere, just as physical documents land in a file cabinet, an inbox on someone’s desk or another location in the organization.  Below are the two components that exist on the far side of the gap:


  • Workflow Software – think of this as the digital inbox and outbox…on steroids.  Workflow software is used to create a digital mirror of your physical processes.  It can move files around, create approval steps, send email automatically and perform logic that usually requires human intervention.  Some organizations don’t have this component on the far side of the gap.
  • Repository – think of the repository as both a temporary and a permanent file cabinet: it can hold files during a workflow process, and keep the archive copy once the whole process is complete.  You can search, sort and organize, print, distribute and copy.  Most repositories allow full-text search, provided the capture software has created a searchable file format, and also allow column-based searching for specific criteria.

I have seen many organizations try to bridge the gap while missing one of the pieces above, or with a piece that cannot meet all their needs.  A missing component can reduce the overall value of the system.  For example, take a scanning copier that an AP department uses to scan invoices.  They email themselves the scans, open them, rename them and then save them into their repository.  Without capture software to automate the naming and routing, this is a highly inefficient process.  Without capture, files are also not made searchable through OCR, which reduces efficiency during search.  Another example might be a repository that cannot provide all the bits and pieces an organization may require.  Take the organization that just saves PDFs to a network directory.  This may be fine for many organizations that merely need a simple archive to house their files.  But what about an audit event, or a legal issue that may require extensive searching and sorting?
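To make the AP example concrete, here is a minimal Python sketch of what automated naming and routing could look like once index values have been captured from an invoice.  Everything in it is an assumption for illustration: the `InvoiceIndex` fields, the naming pattern and the `AP/Invoices` library root are hypothetical, and real capture software would obtain these values from barcodes, OCR or data extraction rather than have them typed in.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath


@dataclass(frozen=True)
class InvoiceIndex:
    """Index values a capture step might collect (hypothetical fields)."""
    vendor: str
    invoice_number: str
    invoice_date: date


def _slug(value: str) -> str:
    """Keep letters and digits; collapse everything else into single hyphens."""
    cleaned = "".join(ch if ch.isalnum() else "-" for ch in value.strip())
    return "-".join(part for part in cleaned.split("-") if part)


def build_name(index: InvoiceIndex) -> str:
    """Standardized file name: vendor_invoice-number_date.pdf."""
    return (
        f"{_slug(index.vendor)}_{_slug(index.invoice_number)}_"
        f"{index.invoice_date.isoformat()}.pdf"
    )


def route(index: InvoiceIndex, library_root: str = "AP/Invoices") -> PurePosixPath:
    """Destination path: one folder per year and vendor under an assumed root."""
    return (
        PurePosixPath(library_root)
        / str(index.invoice_date.year)
        / _slug(index.vendor)
        / build_name(index)
    )
```

The point of the sketch is that the same index values drive both the file name and the destination, so nobody has to open, rename and save each scan by hand.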


“Bridging the Gap” and creating an office with less paper can provide an organization with countless benefits, given proper planning and design and the inclusion of all the above components.

SharePoint Scanning Planning – Part 4 – Document Scanning Models

Document Scanning Models

After doing some planning on the hardware types and document scanning volumes, the next step is to examine what type of model you need to deploy.  There are typically 3 standard models for document scanning and capture: Centralized, Decentralized and Distributed.

Each model has its own pros/cons, and below I will examine each, and dive into some detail.

Centralized

Ah, the centralized model.  Some call this old school scanning and capture, as for many years, this was the only way to get the job done, and convert your paper to digital form.  This model provides a centralized scanning center to provide mass conversion for the organization.  The operation can be run by in house personnel, be managed by a services provider in house, or be outsourced to a scanning service bureau.  It requires high volume/high speed hardware, and typically utilizes advanced capture software to allow for the utmost in automation and efficiency.  The software and hardware operators are typically highly trained, and there are usually only a few of them.  Paper and/or digital media is shipped to the centralized location and processed through a set, standardized capture workflow.

Centralized Pros

  • Easily standardized process due to a limited number of skilled/trained scan operators
  • High speed hardware/software results in minimal processing time once paper is received
  • Centralized reporting and control of overall process
  • No loading on WAN infrastructure
  • Centralized backup and restore

Centralized Cons

  • Usually a high time delay for availability of documents
  • High cost due to shipping of documents
  • High maintenance costs
  • High training costs to bring on new operators
  • Disaster recovery planning issues if centralized site is down
  • Operators are typically not knowledgeable in the documents they are indexing

Decentralized

Over time, as bandwidth and scanning hardware/software prices went down, the obvious move was to decentralize the whole scanning and capture process.  This move placed scanning in the branches, and allowed the whole document capture process to be performed by those who had working knowledge of the documents.  Smaller, desktop class hardware could be used, and most capture companies made batch scanning and upload to the centralized repository simple to accomplish.

Decentralized Pros

  • Scan operators are well versed in the documents they scan
  • Documents are available almost immediately
  • No shipping or transfer costs for documents
  • Branch control of the whole scanning process

Decentralized Cons

  • Standardization can be an issue
  • No centralized control or reporting
  • WAN Bandwidth consumption can be high
  • Licensing costs can be high depending on software utilized

Distributed

The advance of network-based scanning devices and the lowering of bandwidth pricing led to the newest model, the Distributed Model.  Distributed Scanning allows for just about anyone in the organization to walk up to a network scanning device/scanning copier/fax machine and send documents to a repository.  The devices are typically multi-faceted, and along with repository integration, can provide scan to network folder, FTP and email.  Collaborative back-end systems, like Microsoft SharePoint, lend themselves nicely to this model, as they allow anyone to participate in a Document Workspace.

Distributed Pros

  • Put scanning in the hands of everyone in the organization
  • Provides a great launching pad for collaborative solutions
  • Simple, easy to use interfaces allow for minimal training and quick adoption
  • Capture and indexing is now in the hands of the true document owner
  • One-to-many solution provides a single device to service many users

Distributed Cons

  • Lack of standardization without software addition
  • Security and document control can be major issues
  • Bandwidth from smaller branches can be a problem with larger scans
  • Lack of hardware integrations with back-end systems

So, most organizations today are combining the above models to create a Hybrid Scanning and Capture solution, and leveraging all the strengths together to minimize the weaknesses of any one model.   Another strategy is to tie scanning models to specific business processes, as most lend themselves nicely to specific scanning and capture workflows.

Hardware and Choosing Your Scanning Model

 

Most organizations will choose their model to leverage their existing hardware investment, but this can lead to decisions that seem good at the time and do not hold up under deeper examination, when it would make more sense to realign the hardware with the best model.  Take, for example, a company that instantly leans toward a distributed model and attempts to leverage its copier fleet that is currently under lease.  As the part of this guide that covers scanning hardware explains, copiers will not always fit the type of scanning you need to perform.  Take, for example, a branch accounting department that is looking to scan receipts or check stubs.  Will the copier perform well with mixed original sizes?  Just a word of caution: examine the paper, workflow, and document types to get the best feel and adopt the best model.

A Little more on Scanning and Capture Models

For more information, view a webinar on Distributed Scanning and Capture at the link below:

Distributed Scanning and Capture Webinar

 

What scanning and capture model should you choose?

Model, what the heck does that mean?

In traditional scanning and capture, there are 3 well recognized scanning models: centralized, decentralized and distributed.  Below I will cover each in detail:

  • Centralized – Ah, centralized…the old school method.  Imagine a room with ten blue hairs, feeding big iron scanners, and the hum of paper over rollers filling the air.  This is the traditional scanning model, where paper is shipped to a centralized location, and a few highly trained operators with high speed scanners capture and process paper.  This process is easily standardized, but usually the operators are not the knowledge workers that know most about the documents.
  • De-centralized – As bandwidth got cheaper, companies began to look for ways to put the scanning task into the hands of the end users.  The decentralized model provides branch level scanning, usually with smaller desktop hardware, and gives more control to the knowledge workers.  Things get scanned more quickly, and the indexing process is less error prone.
  • Distributed – With the advent of network connected scanners, copiers and fax machines, distributed scanning has evolved to be the model of choice for SharePoint.  It puts the scanning and capture task in the hands of everyone in the organization.  It does have some drawbacks though: you usually need some software to standardize and govern the whole process, security becomes an issue once scanners are available to everyone, and most manufacturers have limited integration options for ECM.

Typically, a SharePoint Scanning and Capture environment requires some type of Hybrid Solution that can be a mesh of all models.  Beware, you will need a capture application that can prosper in all different types of environments.

 

AIIM Capture and Business Process Survey

Here are some great bullets from the latest AIIM Survey:

  • The strongest driver for scanning and capture is improved searchability and knowledge sharing across the business, followed by productivity improvements, reduced office costs and better customer service.
  • 58% of SharePoint users are not storing scanned image files and only 9% are executing any workflow or BPM with scanned images. File sizes and the ability to handle scanned image throughput are the biggest concerns.
  • 39% of responding organizations reach positive payback on their investments in scanning, capture and BPM within 12 months, rising to 60% within 18 months. Automatic document classification shows a particularly high return for the 19% of respondents utilizing it.
  • 60% of respondents have one or more capture and BPM systems. Of these, 39% have a single system in use for all applications. Of those with multiple systems, 80% are looking to converge to a single system.
  • Although respondents expressed a preference to source workflow and BPM as part of an ECM suite or as part of SharePoint, the decision maker for capture and BPM is likely to be a department or Line of Business head, compared to a Head of IT or Head of Compliance for the ECM system.

http://www.aiim.org/pdfdocuments/IW_Capture-and-BPM_2010.pdf

Document Capture Planning for Search Results

Taking up the chore of scanning paper documents into SharePoint is not as simple as it may seem. I had a call with a prospect the other day who oversimplified the task, and had the mindset “we buy a scanner and click go”. I would argue that the most important piece of capturing documents and sending them to SharePoint is planning how you will search for them. So, in your planning, ask the following:

  • What columns are necessary and how will I gather the index data from documents?
  • Is full text OCR possible?
  • What fields can be gleaned during the capture process and which can be populated later?

In the planning process, it is absolutely imperative to plan for the utmost in search flexibility, and I almost always encourage OCR to a searchable PDF.  Why?  Full-text search is the insurance policy.  Say you have an audit or a legal issue, and you are looking for that needle in a haystack, and the value you need doesn’t happen to be stored in a column.  Take the planning process seriously, as once the capture process begins, it is almost impossible to change all your existing documents.
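Here is a small Python sketch of why I call full-text search the insurance policy.  It is an in-memory illustration only, not how SharePoint search works: the `ScannedDoc` class, the column names and the sample documents are all hypothetical.  It shows a column search coming up empty for a value that was never indexed, while a search of the OCR text still finds the document.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScannedDoc:
    """A scanned document with index columns and OCR text (hypothetical model)."""
    name: str
    columns: Dict[str, str]
    ocr_text: str = ""


def search_columns(docs: List[ScannedDoc], **criteria: str) -> List[ScannedDoc]:
    """Return documents whose columns match every given column=value pair."""
    return [
        doc for doc in docs
        if all(doc.columns.get(key) == value for key, value in criteria.items())
    ]


def search_full_text(docs: List[ScannedDoc], phrase: str) -> List[ScannedDoc]:
    """Return documents whose OCR text contains the phrase, ignoring case."""
    needle = phrase.lower()
    return [doc for doc in docs if needle in doc.ocr_text.lower()]


docs = [
    ScannedDoc("inv-001.pdf", {"Vendor": "Acme", "Year": "2010"},
               "Invoice for PO 7781, net 30"),
    ScannedDoc("inv-002.pdf", {"Vendor": "Globex", "Year": "2010"},
               "Invoice for PO 9120"),
]

# The PO number was never captured as a column, so the column search finds nothing.
by_column = search_columns(docs, PO="7781")
# The same value is still reachable through the OCR text.
by_text = search_full_text(docs, "po 7781")
```

If the documents had been stored as image-only files with no OCR text, the second search would have nothing to work with, and fixing that after the fact means reprocessing everything already captured.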

Questions to ask before you start your SharePoint scanning, imaging or capture project

So you want to use Microsoft SharePoint as storage for scanned images? Take a quick breath and don’t charge in too fast, as there are many facets of this type of project that need to be considered.

What type of volume are you scanning on a daily basis?

  
You need to take a deep dive into departmental and end user needs, and really look at the volume of pages they need to image and capture. This brings up a point I discuss on a daily basis: Do you want to scan or capture? You may read this and say, what in the world are you talking about, so here is an explanation:
Let’s create a definition and define a feature set for scanning applications. A scanning application is just a means to quickly and easily convert paper to digital form. These applications are well suited to environments with very basic needs, and to what I call “onesie-twosie” scanning, or low volume environments. Their feature sets provide very basic functionality, and may allow the use of basic separation and very basic integrations with SharePoint. The majority of scanning hardware vendors bundle these applications with their hardware, although there are vendors that have taken it to the next level, and provide enhanced scanning capabilities beyond the typical bundled software.
Document Capture software can be utilized for basic scanning needs, but takes you to a whole new level from a “capture” perspective. These applications typically have a number of ways to “slice and dice” documents, and really focus on efficiency, and on minimizing the time required to scan, index and capture data. Capture software provides numerous ways to automatically populate columns, including barcode reading, database lookups, OCR, and data extraction. True capture applications provide integration with scanners, folders of images, SharePoint WebDAV folders, etc. Any organization that is serious about processing paper documents, and wants to do it in the most efficient, standardized manner, should look seriously at advanced capture applications.
Capture applications are typically well suited to high volume situations, or to situations where data can be extracted automatically. Scanning applications are suited to very simple operations, and usually to low volumes.

What type of scanning device(s) are you going to utilize?

 
There are only a few applications out there that will provide you with the ability to scan from any type of device. Are you going to use network based scanning devices or direct connect scanners? Look into support in these specific areas:
• What types of drivers are supported? ISIS, TWAIN, and VRS should all be supported.
• Can hot folder functionality provide the auto-import and processing of all different image types, PDF included? Hot folder functionality should span local, network and WebDAV folders.
Beware of “panel” based applications. They are typically very static, and can cause a line to form at the MFP/copier while people enter information about their documents at the device itself.
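For readers who have not seen a hot folder in action, here is a minimal Python sketch of the idea: sweep a watched folder, pick up any new file of a supported type, and hand it to a processing step.  The extension list, the `sweep_hot_folder` name and the `process` callback are my own assumptions for illustration, not any product’s API, and a real implementation would also need to deal with files that are still being written.

```python
from pathlib import Path
from typing import Callable, List, Set

# File types the sweep will import (an assumed list; adjust to your environment).
IMAGE_TYPES = {".pdf", ".tif", ".tiff", ".jpg", ".jpeg", ".png", ".bmp"}


def sweep_hot_folder(
    folder: Path,
    seen: Set[Path],
    process: Callable[[Path], None],
) -> List[Path]:
    """Process each supported file in the folder that has not been seen before.

    `seen` is updated in place so that repeated sweeps skip earlier files.
    Returns the files picked up by this sweep, in name order.
    """
    picked_up: List[Path] = []
    for path in sorted(folder.iterdir()):
        supported = path.suffix.lower() in IMAGE_TYPES
        if path.is_file() and supported and path not in seen:
            process(path)
            seen.add(path)
            picked_up.append(path)
    return picked_up
```

Calling `sweep_hot_folder` on a timer against a local or network path gives you the basic auto-import behavior; the same loop works for any folder the operating system can mount.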


What output format do you want in the SharePoint libraries?

 
Scanning and capture applications today provide a broad array of image output formats, but the standard seems to be PDF Image with Hidden Text. This provides an all in one container for the original image and the searchable text. Install the PDF iFilter, and you have a searchable content store. There are some specialized usages that may require other formats. For instance, if you are importing JPEGs with EXIF tags with your advanced capture application, you will want to keep the original JPEG file with tags intact rather than performing a conversion.


What Scanning and Capture features will be necessary in your environment?


What features should you look for? This is the most difficult question of them all, and you really need to find an application with a broad feature set to make sure you can cover today’s needs as well as the needs of your organization in the future. This blog post is a great place to start:
Trends in Scanning and Capture




How much storage space will I require? Where are you going to store your images?


Just a few stats here to get you on your way:
• The standard scanned page can be estimated at 50 KB in size (at 300 DPI)
• A file cabinet contains between 10,000 and 12,000 pages
This can give you a quick idea of how much storage will be required, and let you do some growth estimation over time.
You should also use these numbers to decide whether to use the SharePoint content database for storage, or utilize Remote BLOB Storage (RBS). SharePoint 2010 with SQL Server 2008 R2 allows this without the need for additional software through the FILESTREAM provider.
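The arithmetic behind those stats is simple enough to script. The Python sketch below uses only the two figures above (50 KB per page, 10,000 to 12,000 pages per cabinet); the function names, the 250 working days per year and the use of binary units (1 MB = 1024 KB) are my assumptions, so swap in your own numbers.

```python
from typing import Tuple

PAGE_SIZE_KB = 50                    # estimated size of one scanned page
PAGES_PER_CABINET = (10_000, 12_000)  # low and high page counts per file cabinet


def cabinet_storage_mb(cabinets: int) -> Tuple[float, float]:
    """Low and high storage estimate, in MB, for a number of file cabinets."""
    low_pages, high_pages = PAGES_PER_CABINET
    low_mb = cabinets * low_pages * PAGE_SIZE_KB / 1024
    high_mb = cabinets * high_pages * PAGE_SIZE_KB / 1024
    return low_mb, high_mb


def yearly_growth_gb(pages_per_day: int, working_days: int = 250) -> float:
    """Storage added per year, in GB, for a given daily scanning volume."""
    total_kb = pages_per_day * working_days * PAGE_SIZE_KB
    return total_kb / (1024 * 1024)
```

By this estimate a single file cabinet comes out at roughly 490 to 590 MB, and a department scanning 1,000 pages a day adds a little under 12 GB a year.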


How will I view images once they are in SharePoint?


Without a viewer add-on, SharePoint will require you to open an image to view pages. This can be problematic if you are serving up large image files. Definitely take a look at some of the image viewer add-ons for SharePoint. My favorite, VizitSP SharePoint Viewer, provides the ability to view/preview, annotate, image process, search (column based and full text) and have multiple images open in a tabbed view. This is an absolute necessity if you are going to give end users the best experience possible.

Just some questions to get the gears turning and make sure you get all the pieces to the puzzle.

SharePoint 2010 and Document Sets

So many good posts are coming out on the web for SharePoint 2010. I am working to figure out all the angles on how to improve SharePoint as an imaging, scanning and capture platform, and Document Sets seem to be a great focal point. Here is a great article outlining them and how to use them:

Document Sets and SharePoint 2010

PSIGEN and Atalasoft Create SharePoint ECM Bundle

LAS VEGAS, Nevada, October 19, 2009 – PSIGEN Software, Inc., the leader in advanced capture and scanning solutions for Microsoft SharePoint, and Atalasoft, Inc., the makers of Vizit SP, the SharePoint Document Viewer, today announced the release of a SharePoint Document Management bundle at the Microsoft SharePoint Conference in Las Vegas.

The partners have bundled their technologies to offer an end-to-end Document Management/Document Imaging solution for all organization sizes and types adapting easily to both Windows SharePoint Services (WSS) and Microsoft Office SharePoint Server (MOSS) environments. The combination of advanced capture and viewing technology will provide an affordable, feature rich option to a market with few alternatives.

Microsoft SharePoint Server is a powerful collaboration platform currently experiencing mass adoption across the enterprise, primarily for sharing office documents, replacing network file shares, document management, and as an intranet portal. Atalasoft’s and PSIGEN’s combined solution enables these organizations to realize additional benefits of SharePoint by providing cost effective document image capture, viewing, search and more advanced document management capabilities that leverage existing SharePoint investments.

“In today’s world, being able to put together the best of breed products like Atalasoft’s Vizit SP and PSI:Capture to create a seamless solution will provide the on ramp to SharePoint that users and IT departments have been demanding,” said Bruce Hensley, President of PSIGEN.

“This combination of advanced capture and document viewing & management with Microsoft SharePoint is the ideal ECM solution enabling businesses to easily and cost effectively automate their paper-based processes,” said William Bither, President & CEO of Atalasoft.

The package is available through both PSIGEN’s and Atalasoft’s vast and growing Reseller Network.  The partners will be presenting their bundled technology offering at Microsoft’s SharePoint Conference in Las Vegas, October 18-22.

About PSIGEN

PSIGEN Software is the innovative leader in advanced capture and document management solutions. For more than 14 years, PSIGEN has provided software to improve all the processes around the conversion of paper to digital documents. The solutions focus on cost reduction, flexibility, standardization, and improved efficiency.   PSIGEN delivers these solutions through a network of resellers and distributors in the US and abroad. For more information, visit www.psigen.com.

“PSIGEN” is a registered trademark in the US, the EU and other countries. All other trademarks and registered trademarks belong to their respective owners.

CONTACT: PSIGEN Sales and Marketing: Stephen Boals, Vice President of Sales, 949-916-7700 x230.

About Atalasoft

Atalasoft builds software that improves how users interact with documents over the web through innovative image viewing, annotating, and processing technology. Products include DotImage, the leading imaging toolkit for .NET developers, and Vizit SP, the Document Image Viewer for SharePoint. Atalasoft is listed as one of Inc. Magazine’s 100 fastest-growing software companies and is a member of many industry organizations, including AIIM, ARMA, and the TWAIN working group.

Founded almost a decade ago, Atalasoft’s products power over 1000 document management, ECM, and EMR applications built by ISVs, System Integrators, and Enterprises deployed to millions of end-users worldwide. Industries that Atalasoft serves include Healthcare, Financial Services, Legal, Government, Education, and Manufacturing.  For more information, visit www.vizitsp.com or www.atalasoft.com.

CONTACT: Atalasoft Sales and Marketing: John Casanova, Director of Business Development, 866-568-0129 x711 or at john.casanova@atalasoft.com.

###

PSIGEN and Neudesic Partner to Provide Customized SharePoint Document Imaging Solutions

IRVINE, Calif., September  14, 2009 – PSIGEN Software, Inc., the leader in advanced capture and scanning solutions for Microsoft SharePoint, today announced the formation of a partnership with Neudesic, a Microsoft National Systems Integrator and Gold Certified Partner.

The partnership was created to offer customized Document Management/Document Imaging solutions for all organization sizes and types, providing a structured offering with SharePoint as the back end repository.  The combination of PSIGEN’s advanced capture technology and Neudesic’s Microsoft product expertise will be unmatched in the industry.

“We are extremely excited about this partnership, as we believe what the SharePoint market has been missing is highly functional, packaged solutions that don’t create a never ending construction zone,” said Bruce Hensley, President of PSIGEN Software.  “Together we provide expertise in diverse areas, which when put together, we believe will significantly increase the ROI on SharePoint Imaging Projects.”

“PSIGEN provides a fantastic document imaging and management solution for SharePoint that aligns perfectly with our goal of delivering the most effective solutions in the shortest possible time,” said Parsa Rohani, CEO of Neudesic. “We’re excited to partner with PSIGEN and look forward to bringing our clients unmatched capability and value across a broad spectrum of SharePoint solutions.”

About Neudesic

Neudesic is a Microsoft National Systems Integrator and Gold Certified Partner with a proven track record of providing reliable, effective solutions based on Microsoft’s technology platform.  Neudesic’s technical and industry expertise empowers enterprises to enhance their technological capacity and respond to business opportunities with a greater level of efficiency.  Neudesic was established in 2001 and is headquartered in Irvine, California.  Neudesic offers its products and services nationwide with offices located throughout the United States, and a global presence based out of Hyderabad, India.  For more information about Neudesic’s products and services, call (800) 805-1805 or visit our website at www.neudesic.com.

CONTACT: Neudesic Marketing Director, Melissa Ward, 949-754-4524.

###