Storage management

Storage is important in digital pathology. Image data can be subject to any of the 3 (or 5) Vs of data science:

  • Variety - Imaging data in pathology is generated during biopsies (macroscopic observations on the sectioning station), brightfield microscopy (high-resolution), immuno observations (multiple channels), and z-stacking.
  • Volume - The recorded images are large: think 100k x 50k pixels, sometimes in 16-bit RGB color. An individual slide file can range from about 100 MB (e.g. a needle biopsy) to several GB (e.g. a solid tumor section scanned at 40X magnification).
  • Velocity - Data comes in rapidly, with 100s of slides being scanned on a daily basis. This limits how much pre-treatment and time you can spend on any individual slide.
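
To put the volume figure in perspective, the following back-of-the-envelope sketch computes the uncompressed size of a 100k x 50k pixel image. The function name and the default of 3 channels at 8 bits each are assumptions made for illustration; slide files on disk are normally compressed and tiled, which is why they end up much smaller than the raw numbers shown here.

```python
def raw_size_bytes(width_px, height_px, channels=3, bits_per_channel=8):
    """Uncompressed size in bytes of a single image plane (no pyramid, no z-stack)."""
    return width_px * height_px * channels * bits_per_channel // 8


# The "100k x 50k pixels" example from the list above.
size_8bit = raw_size_bytes(100_000, 50_000)
size_16bit = raw_size_bytes(100_000, 50_000, bits_per_channel=16)

print(f"8-bit RGB:  {size_8bit / 1e9:.0f} GB")   # 8-bit RGB:  15 GB
print(f"16-bit RGB: {size_16bit / 1e9:.0f} GB")  # 16-bit RGB: 30 GB
```

Multiplying the per-slide size by the number of slides scanned per day gives a rough estimate of how quickly storage fills up.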

For these reasons, it's important to have a tile server solution that is flexible.

PMA.core supports the following storage media:

  • local hard disk (think of your conventional C: and D: drives and partitions)
  • network storage like SMB shares (must be accessible via UNC paths such as \\server\path\to\data)
  • S3-compliant cloud storage (Amazon AWS, Western Digital HGST, NetApp, Arvados, IBM…)
  • Microsoft Azure storage
  • FTP server (yup, that free FileZilla File Transfer Protocol server is still around and can now be put to new uses for digital pathology applications!)

Our tile server introduces root directories: virtual mount points that can point to any of these types of storage, wherever your slides are available.

Most importantly, you can configure your root directories in a hybrid fashion, with some pointing to traditional hard disks and others (perhaps for long-term storage) pointing to cloud resources.

This hybrid configuration model also means you can scale easily over time: you can start with a setup whereby your slides are mostly placed on a (big) local hard disk. After a while, you switch over to your organization's network storage. At an even later stage, you can transparently migrate to S3-compliant cloud storage. When an external collaborator temporarily wants to share their slide collection with you, you can ask them to set up an FTP server and point a root directory at it.
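
The idea of root directories as virtual mount points can be sketched as follows. This is a conceptual illustration only, not PMA.core's configuration format or API: the `RootDirectory` class, the `resolve` function, the aliases, and all locations except the UNC example are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RootDirectory:
    """Hypothetical model of a root directory: an alias mapped to a storage location."""
    alias: str
    kind: str      # one of "local", "smb", "s3", "azure", "ftp"
    location: str


def resolve(virtual_path, roots):
    """Split 'alias/relative/path' and look up the storage behind the alias."""
    alias, _, relative = virtual_path.partition("/")
    return roots[alias], relative  # raises KeyError for an unknown alias


# A hybrid setup: routine work on local disk, the archive on a network share.
roots = {
    "routine": RootDirectory("routine", "local", r"D:\slides"),
    "archive": RootDirectory("archive", "smb", r"\\server\path\to\data"),
}

root, relative = resolve("archive/2022/case_17.svs", roots)
print(root.kind, relative)  # smb 2022/case_17.svs

# Later migration: only the mapping changes, the virtual path stays the same.
roots["archive"] = RootDirectory("archive", "s3", "s3://example-bucket/slides")
root, relative = resolve("archive/2022/case_17.svs", roots)
print(root.kind, relative)  # s3 2022/case_17.svs
```

Because clients only ever see the alias, moving the underlying data from one storage type to another does not change the paths they use.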

Root directories can have authentication and impersonation information attached to them. In addition, PMA.core has its own access control lists to determine what user groups and individual users can see and do (according to the CRUD principle).
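
As a rough illustration of CRUD-style access control lists per root directory, consider the sketch below. It does not reflect how PMA.core stores or evaluates its ACLs: the data layout, the names, and the rule that a user's rights are the union of their own grants and those of their groups are all assumptions made for this example.

```python
CRUD = {"create", "read", "update", "delete"}

# Hypothetical grants, keyed by (root directory alias, group or user name).
group_acl = {
    ("routine", "pathologists"): {"create", "read", "update"},
    ("archive", "students"): {"read"},
}
user_acl = {
    ("archive", "alice"): {"read", "delete"},
}
memberships = {"alice": {"pathologists"}, "bob": {"students"}}


def allowed(user, action, root):
    """Return True if the user, directly or via a group, holds the action on the root."""
    if action not in CRUD:
        raise ValueError(f"unknown action: {action}")
    granted = set(user_acl.get((root, user), set()))
    for group in memberships.get(user, set()):
        granted |= group_acl.get((root, group), set())
    return action in granted


print(allowed("alice", "update", "routine"))  # True  (via the pathologists group)
print(allowed("bob", "delete", "archive"))    # False (students can only read)
print(allowed("alice", "delete", "archive"))  # True  (individual grant)
```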

A comprehensive article on storage and image management is available on our blog.

storage_management.txt · Last modified: 2022/07/18 14:59 by antreas