NIH Blueprint: The Human Connectome Project

Connectome In A Box

Order Connectome in a Box

The Human Connectome Project is gathering data on healthy adult brains at a scale and level of quality never before attempted. For example, we are scanning using accelerated multi-band fMRI techniques that can capture up to 1,200 frames in a single resting-state session.

The resulting dataset is, in a word, gargantuan. Our currently released data is close to sixty-four terabytes for 900+ subjects; we are prepared to release data for up to 1,200 subjects before the project is complete. We are committed to data sharing throughout the life of the project, which presents a number of challenges: How can this much data be moved across the world to anyone who wants it? Where will researchers store the data once they have it?

Connectome In A Box is our expedient solution for those who want large portions of the data: a series of hard drives prepopulated with open access image data made available by the HCP. These hard drives can be ordered through the HCP and shipped anywhere in the world.

What's New With The 900 Subjects Release?

There have been several updates of note in April 2016 for current users of Connectome In A Box. Here are the highlights:

  • MR data from all subjects includes MSM-All registration. Our last data release only included the MSM-Sulc intersubject-registered data. Now, both MSM-Sulc and MSM-All data are available for all 900 subjects.
  • Newly released data is fully compatible with the 500 Subjects data release.
  • We are distributing data on 8-TB drives. Along with the increase in the number of subjects since the 500 Subjects release, the total size of the data release has grown from 20 TB to 64 TB. This necessitates a move from 4-TB to 8-TB drives for data distribution.
  • New guidance on compliance with your IRB. All Connectome data administrators should read this notice on data usage and compliance with your institution's restrictions on human subject research.
  • Now taking preorders for supplemental data releases. The 900 Subjects data release is available for purchase now, and we are also taking preorders for the updated HCP Starter Kit, the 500 Subjects Supplement, and the MEG Subjects data release. How do preorders work?

MEG Subjects Data Release: Now Taking Preorders

We will be releasing a dataset of all MEG and MR data for the 95 subjects who have MEG data. This data will be available for purchase soon. We are currently taking preorders for this dataset, for those who want to guarantee an early place in our delivery queue.

How do preorders work?

IMPORTANT:
Before you begin to use HCP Data, please review the set of available HCP Data Use Terms, and follow the steps to accept the terms that apply to your research.

What is Connectome in a Box?

When can I expect shipment?

We process orders in the order in which they are received. Most standard orders will ship within 30 days of being placed. However, high order volume and low drive inventory may affect this turnaround time. We will display current inventory levels and order volume during the order process, prior to payment.

It's free, right?

Unfortunately, no. Researchers are required to pay for the drive as well as shipping and processing costs. The HCP makes no profit from the sales of Connectome data. We have worked to minimize the cost to the research community and will adjust pricing as market rates drop for high-capacity drives. Currently, the total cost to investigators for each drive of data is approximately $200, including shipping costs. (Non-US customers can expect to pay more.)

What if I want to "try before I buy"?

We highly recommend interested HCP data users visit ConnectomeDB and download packages of data from one of our pre-selected subject groups.

What formats does Connectome in a Box come in? Can I get it formatted for Mac? Linux?

As of July 2013, we are only shipping drives formatted for Linux (EXT3). Mac and Windows users may be able to use workarounds, but these are not guaranteed to work on all versions of these operating systems. Please see this updated note on drive formats for more information.

What happens when future data releases come out?

Each time a new dataset is released, interested users will be able to order the data on a new drive. Drive prices may change from release to release, based on our cost from suppliers.

What happens if a data set I already have is updated?

There are two classes of updates. In July 2013, we completely reprocessed all data, which made data on the original Q1 drives incompatible with newer data. In that instance, we strongly recommended replacing data with new drives. In two instances since then, we have released simple data patches users can apply to their current data to fix issues and replace missing files.

Please note: the data released in the 500 Subjects Release has been newly reprocessed, and should not be mixed with data from previous releases. (Data from the 900 Subjects Release is fully compatible with data from the 500 Subjects Release, and will be compatible with the upcoming final 1200 Subjects Release.)

This is an internal hard drive? What if I have a laptop?

We anticipate two kinds of users ordering these drives. One will be able to take the drive out of the box and give it to their IT person, who will transfer the data onto their network storage unit and make it broadly available to the investigators at their center.

The other will want to plug it right into whatever computer they use and get at the data directly. For this second class of users, we highly recommend purchasing the HCP Starter Kit, which is a single-drive dataset. To connect this drive directly to your machine, you will want to purchase a USB drive enclosure that supports 8TB hard drives, such as this one from VanTec.

How do I verify that the data on my drive is complete?

We strongly recommend verifying your data integrity using MD5 checksums that are included with each subject.

For every package file, whether on a Connectome in a Box drive or in a downloaded set of subject data, a series of checksum strings is stored in JSON files. These files are located in a hidden directory named ".xdlm" within each subject folder (example: 100307/.xdlm). Each checksum can be checked with md5sum to verify that the contents of your copy of a file are the same as they were when the file was created.
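For users who would rather script this check than run md5sum by hand, the following is a minimal Python sketch, not an official HCP tool. It assumes you have already read the expected checksums out of the JSON files in ".xdlm" into a mapping of relative file path to MD5 string; the layout of those JSON files is not described here, so that step is left out. The function names md5_of_file and find_mismatches, and the shape of the expected mapping, are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large image files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(root, expected):
    """Compare files under root against their expected checksums.

    expected is an assumed mapping of relative path -> MD5 hex string
    (hypothetical shape; build it from the .xdlm JSON files yourself).
    Returns a list of (relative_path, reason) tuples for files that
    are missing or whose checksum differs.
    """
    problems = []
    for rel_path, want in sorted(expected.items()):
        target = Path(root) / rel_path
        if not target.is_file():
            problems.append((rel_path, "missing"))
        elif md5_of_file(target) != want.lower():
            problems.append((rel_path, "checksum mismatch"))
    return problems
```

An empty list from find_mismatches means every listed file was found and matched its expected checksum. Any entry in the list identifies a file that is worth re-copying or re-downloading.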

Can I "recycle" hard drives with old data on them?

We have canceled the HCP drive recycling program, as of the 500 Subjects Release. This decision was the result of a cost-benefit analysis that factored in the upsurge of drives that we need to manage for each shipment, the impact of maintaining such a program on our internal resources, and the relatively low usage of the program over the past 15 months.

What if I get a bad hard drive? Can I return it?

Prior to shipping, the accuracy and completeness of the data on each drive are digitally verified. However, it’s possible that a drive may become defective after shipping. If your drive stops working within 30 days of receipt, you may ship it back for a replacement at no additional cost. Please contact orders@humanconnectome.org for more information.

How does the HCP create the drives?

The HCP informatics team has assembled a high-throughput drive duplicator, referred to by its builders as the “duplicatinator.” We use rsync to duplicate drives. Once the copies are completed, we use a checksum to verify data integrity.

How do I order?

Go to http://humanconnectome.org/data/data-request/ to get started.