The Audit

The Future of Digital Storage: From Hard Drives to DNA

September 18, 2023 IT Audit Labs

The Audit - Episode 26 - Ready to decode the future of data storage technology? We guarantee that you'll be fascinated by our in-depth exploration into this rapidly evolving landscape. Together with our esteemed guest, Bill Harris, we probe into the intricacies of current storage mediums, such as hard disk drives, flash drives, and magnetic tapes, while also introducing you to emerging technologies like 5D, DNA, and molecular memory. 

How are companies managing their data storage amidst ever-shrinking IT budgets? How are advancements like heat-assisted magnetic recording and microwave magnetic recording redefining hard drive technology? Brace yourself, as we take you on a journey to decipher these challenges and discoveries, along with Harris, a pioneer in the field. The conversation gets even more exciting as we delve into futuristic concepts like holographic and DNA storage, both promising yet fraught with challenges worth discussing. 

But we don't stop there. As we dig deeper into the impact of increasing storage capacities, it's evident that a revolution in the way we use and perceive data is imminent. From holographic and 5D crystal storage to DNA storage, we ponder the implications and potential of these advancements on the future of technology. Tune in, let's explore this fascinating world of storage technology together!

Speaker 1:

This presentation by Bill Harris looks at the current state of storage technology and where it's likely to be in 2026 and beyond. We will take a look at advancements in legacy storage mediums like hard disk drives, flash drives and magnetic tape, and explore new mediums including 5D, DNA and molecular memory. Join me, Nick and Bill Harris as we take a look at this fascinating tech that will surely impact us all. All right. Hey, Bill, how are you doing?

Speaker 2:

Doing well, how are you?

Speaker 1:

Good. So we're back today and we're talking about storage, the future of storage. Bill, you had taken us on a journey here. We went pretty deep last time when we talked about quantum, and before that we talked about the future of classical compute and the lithographies. Is it lithographies or lithographs? Those are getting much smaller. Then we went on a deep dive into quantum and, Nick, I feel dumber after that one.

Speaker 3:

I hope I can leave today not feeling quite as dumb as I did the past two weeks. It'll be fun.

Speaker 2:

Today's going to be fun, I think so. A lot of interesting things happening and we won't get as deep into the physics as we did with quantum.

Speaker 1:

Are you going to reveal anything else about yourself? I think the first week it was whiskey drinker and bromie, is that what it's called?

Speaker 2:

That's what you say. Absolutely sure, why not?

Speaker 1:

Are you denying the whiskey part or the bromie part?

Speaker 2:

I think I'm going to have to have more whiskey to get through the bromie part.

Speaker 1:

Oh, it's brony, I messed that up. What else do we get, Nick?

Speaker 3:

I think that was a good overview. The brony, I think that one sticks the best. Oh, the cats too. I think we discussed cats. I'm the only one with a million cats.

Speaker 1:

Nick, you're having a baby soon. How's that going to come into play with the cats?

Speaker 3:

I don't know. I think we're going to have to get rid of some of these cats. We've got too many. Yes, baby is close on the way. Any tips, Bill?

Speaker 2:

Any tips to get rid of cats? No, get rid of the baby. Open the door. No, I don't know about the baby yet. Good luck, man. It's going to be quite a journey.

Speaker 3:

The cats. I like it. Open the door. First one, yep. Nice. So much fun. It is fun. We're looking forward to it, but it's going to be a big change.

Speaker 1:

Well, awesome, all right. Well, let's get into it then. Bill, we'll turn it over to you to take it away here.

Speaker 2:

All right, hello. This is Bill Harris with IT Audit Labs, and today I'll be talking about the future of data storage. Here's our agenda for today. We're going to start kind of broad and really start to get into some details as we move from current day into the future. I'm going to go over the data landscape as it exists today and I'm going to talk about what the industry response is to that, how the industry is compiling solutions to address today's challenges. Among those will be magnetic tape, magnetic hard drives, flash storage and holographics, and we're going to go all the way into DNA and molecular and then talk about what it all means and when we might expect to see some of those technologies become available.

Speaker 1:

That magnetic tape, Bill. I've had to do a few restores reliant on tape and I can't say I've ever had a good tape experience.

Speaker 2:

That's part of the problem, but we'll talk about some of the upsides of that as well, I think.

Speaker 3:

I've built a smile. Well, we did.

Speaker 2:

So let's talk about the data landscape as it exists today. In today's terms, there's a little bit over 100 zettabytes of data, so just a tremendous amount of data in the world today, and that's going to be doubling, probably by 2025. Data centers themselves are responsible for about 2.5% of carbon emissions. These things, these big spinning hard drives and these racks and racks of computer equipment, are getting a lot of attention as we seek to become more environmentally aware. So, as I referenced earlier, by 2025 we expect the amount of data that we generate each day to reach about 463 exabytes, that is, almost half of a zettabyte. So it's just growing by these huge amounts.

Speaker 1:

Bill, you used the term zettabyte. It's kind of like the quantum conversation, where the different names of the quantum computers were getting a little bit out there. The names of the size of the byte, right, when you get beyond megabyte and you get into gigabyte and then terabyte, petabyte. You're using terms like zettabyte. Is that beyond terabyte?

Speaker 2:

So it goes from terabyte to petabyte, to exabyte, then to zettabyte, and then after zettabyte comes a yottabyte, which is the most fun name of them all, because it sounds funny.

Speaker 1:

Is that the one that was based on Star Wars, with the Yoda? It's like Yoda-byte.

Speaker 2:

I don't know. It seems like that prefix, yotta, would have existed before Star Wars, but something to look into for sure.

Speaker 1:

Nick, there's gonna be a quiz on this later.

Speaker 3:

I'm actively taking notes and looking up zettabyte. That's good.

Speaker 2:

Let me know, please. So let's put this into perspective, now that we know where zettabyte falls in the taxonomy of bytes. If you were to picture this, we're talking about 485 million hard drives every day, if a hard drive is a 1.2 terabyte drive. Most people can envision that. If you've seen a hard drive inside your computer, just imagine that many hard drives. That's what we're talking about when we're talking about 463 exabytes.

Speaker 1:

And Bill, that's what you're saying is generated every day?

Speaker 2:

By 2025, that's what we expect.
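The unit ladder and the drive comparison above can be checked with a little arithmetic. The following is a minimal sketch, assuming decimal (SI) prefixes, where each step is a factor of 1,000; the function name `drives_needed` is purely illustrative. Under those assumptions, 463 exabytes works out to roughly 386 million 1.2 TB drives, while the 485 million figure quoted in the episode would correspond to drives of a little under 1 TB each.

```python
# Decimal (SI) storage prefixes, in the order named in the episode.
# Assumption: 1 terabyte = 10**12 bytes (not the binary 2**40).
PREFIXES = {
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
    "yottabyte": 10**24,
}


def drives_needed(total_bytes: int, drive_bytes: int) -> int:
    """Return how many whole drives are needed to hold total_bytes."""
    # Ceiling division using integers only, to avoid float rounding.
    return -(-total_bytes // drive_bytes)


daily_bytes = 463 * PREFIXES["exabyte"]        # projected daily data volume
drive_bytes = 12 * PREFIXES["terabyte"] // 10  # a 1.2 TB drive

print(f"Share of a zettabyte: {daily_bytes / PREFIXES['zettabyte']:.3f}")
print(f"1.2 TB drives needed per day: {drives_needed(daily_bytes, drive_bytes):,}")
```

Integer arithmetic is used throughout so that the drive count is exact rather than subject to floating-point rounding.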

Speaker 1:

What is generating that data? Like, what's the content?

Speaker 2:

Oh yeah, oh sure. So I mean, it's everyone's cat photos on Facebook, right? It's all the Reddits that you have out there. It's all the movies, right? So it's not only the movies that are hosted by the studios, but it's all the files that you might find on peer-to-peer networks, and it's all sitting out there. It's health information, it's just everything, right? And so that kind of gives you a feel for all that data that we're referencing, and a lot of it's duplicative. A ton of this is just redundant data. It's not necessarily unique. So this is an enormous cost, right? We're talking approximately five trillion dollars per year to continue to grow at this data rate. So you can see where we're going with this conversation today, where some of the challenges are, which is, we have to reduce this data footprint first of all, right?

Speaker 3:

Bill, sorry to interrupt you. Do you know who's actually storing all this? Like, who's the biggest player? Amazon, Google? Who's doing a majority of it?

Speaker 2:

Yeah, so, yes, you're gonna have Amazon, Microsoft, Google, Oracle, like the big clouds, for sure. You're gonna have governments, right? So the United States government is a monstrous store of data. Private sector as well: big healthcare companies, financial firms, etc. The residential space, you know, adds up to a lot.

Speaker 3:

Interesting. Yeah, thanks.

Speaker 2:

Yeah, absolutely. So all of these companies are looking for ways to fit that same amount of data onto less and less infrastructure, right? Because they don't really want to give up their data. For them, their data means money, it means information. For the military, data means having a leg up on your worldwide adversaries. In a business space, having all that data means being able to provide additional value that your competitors can't provide, right? So they don't want to give all that up. So we're seeing some really interesting technologies come out to compress that data, and a lot of these new technologies use things like artificial intelligence and machine learning to identify not just duplicate blocks, but blocks that are close to being similar, and then they apply these algorithms to compress that and then decompress it when it's needed. So you'll see a lot of these early innovations at places like Microsoft and Facebook. That's kind of typical as these companies grow: we're seeing Google, Microsoft and Facebook coming out with a lot of these technologies first. Now, as we go through this, you're gonna see also how all these technologies have a keen eye towards density. Not only do you need to shrink the data footprint, but they want that storage medium itself to be as small as possible, because smaller usually means cheaper and fewer materials. It's going to draw less power and produce less heat. Over time we will see the chipping away of mechanical data assets: those mechanical hard drives, robotic arms. You're gonna see that type of thing eventually start to go away as they introduce either solid-state or molecular or biological mediums.
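The idea of storing each duplicate block only once can be illustrated with a small sketch. This is a deliberately simple version that assumes fixed-size blocks and exact matching by SHA-256 digest; it does not attempt the machine-learning-based "nearly similar" matching mentioned above, and the function names `dedupe_blocks` and `rebuild` are hypothetical.

```python
import hashlib


def dedupe_blocks(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep one copy of each unique block.

    Returns (store, recipe): store maps a SHA-256 digest to the block bytes,
    and recipe is the ordered list of digests needed to rebuild the data.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # store the block only the first time
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the unique blocks and the recipe."""
    return b"".join(store[digest] for digest in recipe)


# Four blocks of input, but only two distinct block contents.
sample = b"A" * 4096 * 3 + b"B" * 4096
store, recipe = dedupe_blocks(sample)
print(len(recipe), "blocks referenced,", len(store), "blocks actually stored")
```

Real systems typically add variable-size chunking and compression on top of this, which is where the more advanced techniques come in.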

Speaker 2:

So, Eric, let's start right here. You mentioned tape earlier. Tape has actually got a long life ahead of it, right? Here's where tape is used today highly effectively: it's used in the archive space. So what you're looking at in the image here is a giant robotic tape library. For reference, these things often stand about six or seven feet tall and they can be as much as 15 or 20 feet long, or they can be a silo type of apparatus. And in these libraries, these mechanical arms that you see over here will just come out and grab the tape, put it into the drive, read the data off, and put the tape back.

Speaker 2:

Now, this is effective if you never take that tape out of the library, because where the problems start is as soon as that tape comes out of the library. You've got to send it off-site and you've got to track it and bring it back. It's a mess. So keep it inside of that library, which is temperature controlled and humidity controlled, and you've got yourself a highly, highly dense storage medium that is enormously cheap. All of the seismology that runs today runs on tape. It stores data off to tapes. So all the trackers out there that are monitoring for earthquakes store all that data to tape today. Anything you put into Amazon Deep Glacier, it's going to tape, right? So it's effective if you follow these basic tenets of store it and leave it.

Speaker 1:

Also, it doesn't do well if you're overwriting it a bunch of times. As I recall, tape does have an effective life to it.

Speaker 2:

So if you have to overwrite it like hundreds of times, then yeah, you're gonna wear the tape out. But here again, if you're writing archival storage, you're not trying to overwrite that. You're writing data and you're gonna keep it, you know.

Speaker 1:

So I agree with what you're saying. When I was early in my career, I was working for a startup company and we were running Exchange 2000, I believe, at the time, and we were moving to a new building. So we're going into the new building, they're building out the office space, and they were consolidating the data center in the office space. It was the equivalent of about three offices in size, the data center. I walked in there, and I had the responsibility of running it, but that meant very little. They really didn't listen to anything I said, but I still had the title. So I walk in and I see that we've got our servers in this quote-unquote data center, and they have some cloths draped over the racks, and the workers were in there with drywall saws, manipulating the environment. I think they were going to make it a little bigger. I don't really recall exactly what they were doing, but I was like, wow, we absolutely can't have this. This dust is going to get everywhere. Drywall particles are very small. It's going to be a huge problem. So of course I bring this up to management. And this was in the dot-com boom, where you could start up a company and not really know what you were doing, but you could still run it. Well, these jokers, I bring it up: hey, we can't do this. They basically give me a blank look and they're doing it anyway.

Speaker 1:

So long story short, fast forward a couple of months. Hard drives start failing. And they started failing because, obviously, the dust had gotten in there and created a whole bunch of problems. Most of the time they were set up in either RAID 5 or RAID 1 and we were able to replace the drives as we could. But then came that day where the Exchange server went down. It was in a RAID 5 configuration and we lost more than one drive. I think we lost two drives, maybe three, and I think there were maybe a total of seven in the whole thing, including some of the hot spares.

Speaker 1:

Long story short, we could not recover the Exchange data from those drives. We had sent them off to a recovery facility to try to get what they could off of them, and we were trying to restore from tape in the meantime. But that tape restore did not go very well. As I mentioned, the company did not want to spend a lot of money on IT, so we were just going through the same 30 tapes, throwing them in this fireproof box, and then somebody was carrying that thing home every week, and the whole setup was a joke. But that's why I have this PTSD with tape. It probably wasn't the tape itself; it was the oversight and the management of the tape, and we should have been doing it a lot better than we were. But spending that IT dollar was not as important as spending the marketing dollar. And, long story short, we lost a bunch of email, couldn't get it back, and that was the end of it.

Speaker 2:

Yeah, that sounds pretty horrific. First of all, sorry you had to go through that, but I'm not too surprised given the way that you described how the tape was maintained and transported. So, yeah, these mechanisms are really sealed mechanisms the tape lives inside, it doesn't come out and, as a result of that, it's clean. It's extremely power efficient, because you're not activating it until you're pulling data. It's really cheap. It's extraordinarily dense that tape's about the size of a hard drive and it stores 580 terabytes. So, wow. So it is still a very effective platform for storing data that you don't need speedy access to. But let's talk about what's happening in the hard drive world.

Speaker 2:

So hard drives, these mechanical drives that we're all used to seeing, have been around for a long, long time and gone through a lot of iterations, and the latest iterations are as follows. The ones that they're largely selling now are called HAMR drives, for heat-assisted magnetic recording. This is just a mechanism where they heat up the platter before they put the bits onto the platter, and by doing this they can shrink the area each bit takes up, improve the areal density of the platters and store more data.

Speaker 2:

Now, for those who aren't aware, in a hard drive you've got yourself these four to five, sometimes more, platters that are a little bit smaller than the size of the drive itself. They're all stacked one on top of another, and between these platters there's a little mechanical arm with a read and write head. The drive spins around at anywhere from 7,200 RPM to as much as 15,000, and as the drive spins, that mechanical arm goes across the platter and reads all the data. So the smaller the space you can put bits into on the platter, the more you can squeeze onto that platter. If you're improving that density and you're squeezing more bits onto the platter, that means the drive actually has to spin less for you to pick data up, right? So if you're streaming off a lot of data that's lined up perfectly, you can stream data off of that drive extremely quickly, sometimes outrunning solid-state disks.
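The spin speeds mentioned above translate directly into how long the drive waits for data to come around under the head. A minimal sketch, assuming the standard rule of thumb that average rotational latency is the time for half a revolution (the function name is illustrative, and seek time and transfer time are ignored):

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average time, in milliseconds, for a sector to rotate under the head.

    On average the platter has to turn half a revolution before the
    requested data arrives, so latency = 0.5 * (60 seconds / RPM).
    """
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


for rpm in (7_200, 15_000):
    print(f"{rpm:>6} RPM: {average_rotational_latency_ms(rpm):.2f} ms average latency")
```

This is why sequential reads are fast while random reads suffer: data that is laid out contiguously avoids paying this wait again for every request.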

Speaker 1:

Kind of like. It's like almost like a record player, right With the needle on a record player and just going around and getting that data off of there. I remember back at that same company not to digress on them again, but we had taken apart a couple of the drives and this was the era of the early IBM DeskStar drives, if you recall those but when we took one apart we saw that the actual platter itself was made out of glass and then they had sprayed some sort of a coating on top of the glass that I guess was the magnetic piece that stored the data. And I think this is you know we're talking. The drives were in the low gigabyte sizes. This was kind of early 2000s.

Speaker 2:

Yeah, that's right. So there's this magnetic coating that holds the magnetism of the bits being deposited to the platters. So then, after the HAMR drives, comes heated-dot magnetic recording, which is an iteration on the same concept. It doesn't require too much more explanation because it's very similar to what I just talked about; it just gets that dot even smaller. And then they're also working on microwave magnetic recording, which deposits the dots (I call them dots, but it deposits the bits) to the platters using microwave technology. Again, it's just another effort to make it as dense as possible.

Speaker 1:

So would you say there's a heating element in the drive or, in some cases, a microwave unit in the drive itself.

Speaker 2:

Yes, like a very tiny mechanism that either heats that platter up or transmits that little tiny microwave to aid in recording.

Speaker 1:

So when Nick needs to heat up his veggie burger, he could do that in the data center.

Speaker 2:

So, as a result of these technologies, we do expect to see 30 terabyte drives by the end of this year. Now, right now, the biggest drive you can really buy is going to be, I think, about 22 terabytes. However, Seagate has released the 30 terabyte drives for early testing to select customers, mostly in the cloud space, of course, because they tend to get them first. We should expect to see 100 terabyte drives by 2030 using these technologies, and what makes it so easy to adopt is that it is still a hard drive, just like any hard drive we use today, and so you plug it in.

Speaker 1:

Same form factor?

Speaker 2:

Yeah, same form factor, same SATA interface, so it's brilliant from that aspect. There's no barrier to adoption. You just shove it in there and away you go.

Speaker 1:

If they're getting smaller, the amount of data that you're putting in a really small space keeps shrinking. How do they deal with that? Like from an error correcting standpoint, like if there's a tiny bit of damage to the drive or a dust particle that's going to disrupt lots and lots of data, versus in the past, where it might not have been as big of a deal.

Speaker 2:

Yeah. So a lot of the drives have error-correcting circuitry and telemetry built into them, so that if the drive detects any issues with its medium, it will transport that data to a better area of the platter and then cordon off that bad area so it doesn't write to it anymore. This is wild. So really, the challenge is this. It all sounds fantastic, but you still have yourself a spinning hard drive, and so it is still slow by today's standards: faster than tape usually, but not always, and slower than solid-state disk. So before we go to NAND, what they're doing is addressing this with dual actuators. On the dual actuator drives, all they do is insert a second arm in there. Now you've got two arms on the drive reading and writing data to these four or five platters at the same time, and so that doubles your throughput and gets it up to about SSD speeds, but at the risk of introducing another mechanical device that can fail. I find it an interesting kind of way stop on the way to full solid-state, and I'm not sure how much that dual actuator device will really be adopted, but it's innovative.

Speaker 2:

So let's talk about NAND, because NAND has really supplanted so many hard drives today. First of all, when I say NAND, I'm talking about the little storage cells that make up flash drives and solid-state disks, all kind of the same thing. This has been around since the 80s. It's old technology. It's just been perfected over the last two decades.

Speaker 2:

The way they're improving on this today is that they're going three-dimensional with these NAND layers, right? So each one of these little layers you see in this graphic is a bunch of these storage chips, which I keep calling NAND. Most of them are produced overseas and then assembled, but now they're going up. Whereas before they would kind of go left and right with it, now they're building these up and stacking them, and that is why they call this 3D NAND. So you may see that on a lot of drives today; it's just that they're going up the chip. This makes it really efficient from a space perspective, but it also makes it pretty speedy, because you can now use this third axis to retrieve data.

Speaker 2:

NAND, and solid-state drives built on NAND, are power efficient because you're just kind of sipping electricity as opposed to running a motor in a hard drive. They don't produce a ton of heat and, of course, they're a whole lot faster, because they're fetching the data completely electronically as opposed to mechanically. The downside with NAND is that it is still not especially dense. The biggest solid-state drive you can find today that's built into that form factor, I think, is 100 terabytes. So what a lot of the buyers have done today in the enterprise space is they've departed from the hard drive form factor, and now they're building these separate solid-state cards that fit into their devices, and in doing that they can reach higher levels of density and start providing multi-petabyte storage arrays.

Speaker 1:

Would you say this is the stuff that you see on the USB storage drives.

Speaker 2:

Yeah, USB storage drives use these NAND cells in them. Usually they're lower quality NAND cells, meaning that you can only write or rewrite to them so many times before they just give out. But yeah, same technology.

Speaker 1:

I was talking with somebody the other day and they mentioned that now Micro Center is giving away 64 gig drives for free. I think you have to get the coupon out of the paper or whatever it is. It used to be. It was like 64 megs and now they're up to 64 gigs already.

Speaker 3:

So I was thinking about the floppy disks, the floppy drives. I remember having to bring those to school, to elementary school, to give to your IT teacher. That's how you would put little things to bring home on and just small amounts of data go on there. It's just crazy what we've got now in your pocket.

Speaker 2:

Yeah, those little thumb drives are very cheap. They're very effective. You might go to a vendor meeting and they used to give you their marketing collateral on these thumb drives as you walked out the door. They're just very cheap to produce, and when you're done with them you can just throw them away.

Speaker 1:

They'll give you that and a virus at the same time. There you go.

Speaker 2:

Now it's just all through email and websites.

Speaker 1:

Nick likes to do a thing where, for a pen test, he drops USB drives in the parking lot. It's amazing.

Speaker 3:

I'm just saying, you put the USB drive in the wall, in the masonry, you know, you cement around it. You leave a little end sticking out and you plug your computer right into the wall. See if somebody will do that. People will do it, they'll do it. Yeah, that's crazy. It's a thing.

Speaker 2:

It's nuts. It's about as clean as, you know, taking a lollipop you find on the street and just popping it in your mouth. It's like, why would you do that?

Speaker 3:

It's like running into a burning building. You're right. There's no good way to come out of this.

Speaker 1:

Nick, do you think if we put a few of these USB drives around that said "Bill's summer pictures," we could get people to plug them in and take a look?

Speaker 3:

Without a doubt. Without a doubt.

Speaker 2:

Summer pictures. That's good. So we're going to move on to some of the more futuristic stuff. Everything I just talked about is available today. Now these next few items are going to be things that are in development. I'm going to kick it off with holographics, because holographics is something that got started over 10 years ago and it never quite really went anywhere, and I'll talk about why that happened and what's coming next.

Speaker 2:

So, to introduce us to this concept: the way holographic storage works is that it records images to the same area of a medium, usually something like a DVD type of medium, using light at different angles, and creates a hologram out of it. So I think it's fairly easy to think about. If you think about a hologram, it's like, OK, that makes sense. This was originally thought of back in the 60s, so it's nothing super new. It promised very highly parallel throughput, because you could read that data back at multiple angles, just like you wrote it, and so you can parallelize the throughput. It's also very dense. Everyone in this industry that I've been looking at recently likes to measure things in terms of sugar cubes. This will probably come up again today in this presentation. But you can fit a terabyte into a sugar cube. Sure, that's pretty good. That's pretty dense.

Speaker 2:

The challenge to this, though, and this is one of the reasons I think it didn't quite make it, is that it's not rewritable. So you burn that hologram into that medium, and that's where it lives forever. It's super durable, it's going to last thousands of years, but you can't go over and redo it. That's good for archival storage, it is. But I think the other reason it didn't make it is that, although it's sort of dense-ish, it's not quite dense enough. Given the optics that you would require to write to it, as they say, the juice wasn't quite worth the squeeze. So this is now on hiatus. The technology is there, but it's looking increasingly unlikely that this will ever really make it to mass market.

Speaker 2:

So a child of holographic storage that holds a little bit more promise, and that they're playing with right now, is 5D crystals. This uses a very, very fast laser to etch into a specialized glass. The laser pulse is measured in femtoseconds, and a femtosecond is one quadrillionth of a second. It's super, super quick. What it does is stab a little dot into this specialized glass, and that dot then represents a bit of data that can be read back later. It can do this all over this little platter, and this is what you're looking at here. These little lines are all little tiny dots.

Speaker 2:

This was introduced closer to the turn of the century and it is even denser than the holographic storage that we talked about. So you're looking at 500 terabytes on a 12 centimeter disk, so it's pretty dense, and it lasts billions of years. So this is a fantastic solution for archival. But wow, is it slow. Reading all that information back does take you back to the modem years. It can send information back at around 28,000 bits per second, which for today's information load is probably not sufficient. So either they're going to have to improve the throughput on this or it'll probably just fall by the wayside.
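To get a feel for why that read speed is a deal breaker, the figures quoted above can be turned into a read time. This is a back-of-the-envelope sketch that takes the episode's numbers at face value (500 terabytes per disk, 28,000 bits per second) and assumes decimal terabytes and a 365.25-day year; the function name is illustrative.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_to_read(capacity_bytes: float, bits_per_second: float) -> float:
    """Return how many years it takes to read capacity_bytes at a fixed bit rate."""
    total_bits = capacity_bytes * 8
    return total_bits / bits_per_second / SECONDS_PER_YEAR


disk_bytes = 500 * 10**12  # 500 TB, as quoted in the episode
read_rate = 28_000         # bits per second, as quoted in the episode

print(f"Full read of one disk: about {years_to_read(disk_bytes, read_rate):,.0f} years")
```

At that rate a single full disk would take on the order of thousands of years to read end to end, which is why throughput has to improve before the density matters.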

Speaker 1:

Why is it called 5D versus 3D?

Speaker 2:

I don't really know. I want to say probably because it sounds a lot cooler, but I'm not certain.

Speaker 3:

Yeah, and I guess Bill too, with this. Compared to tape, is this more cost effective? Is it much more expensive or?

Speaker 2:

So at the moment, terabyte for terabyte, it is much more expensive than tape because of the specialized materials required, which isn't to say that they can't make it cheaper than tape in time, but they're really going to have to pick up that transfer speed, because that's probably going to be the deal breaker. So this next one holds more promise. I tell you, DNA storage has really been getting a lot of attention over the last decade or so. A lot of companies have really invested in it. Microsoft is a huge investor in DNA storage. They've got an R&D lab devoted to it. They're developing devices to make this work better. Other companies too. Microsoft really stands out, though.

Speaker 2:

So DNA storage is, as the name implies, the ability to store information on synthetic DNA. Now, this is the same type of DNA that we think of, which uses these four nucleotide bases, and so it puts the data on there in a very dense fashion: 215 petabytes for one gram. A penny weighs about two grams, so you can fit almost half an exabyte into the weight of a penny. Absolutely huge. In fact, it is surmised that we could fit all of the world's information in DNA within the size of a refrigerator. So it's stunning how much this thing can actually do. It lasts millions of years. If you've seen the movies, I can call out Jurassic Park, and sure, there's a lot of fiction there, but the basic premise is sound: scientists have dug up DNA that is thousands or millions of years old, and they've been able to reconstruct some of it and read some of it back.
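The penny comparison above is simple multiplication, sketched here using only the figures quoted in the episode (215 petabytes per gram, and a penny taken as roughly two grams) with decimal prefixes assumed:

```python
PETABYTE = 10**15
EXABYTE = 10**18

density_bytes_per_gram = 215 * PETABYTE  # DNA density quoted in the episode
penny_grams = 2                          # approximate weight used in the episode

penny_capacity_bytes = density_bytes_per_gram * penny_grams

print(f"{penny_capacity_bytes // PETABYTE} petabytes")
print(f"{penny_capacity_bytes / EXABYTE:.2f} exabytes")
```

That comes to 430 petabytes, or 0.43 exabytes, which matches the "almost half an exabyte" description.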

Speaker 1:

Do you think the Russians and the Chinese are cloning dinosaurs?

Speaker 3:

I don't think so.

Speaker 2:

There's a pile of laughs. We got them. Yeah, nice one. Yeah, that'd be fun. That'd be a whole other presentation. If we're doing that, I'll come back for that one. So, some of the other benefits of DNA. As I mentioned on the slide, it's low power, right? Because once you put the data there, you're good, it's there, you don't have to do anything else to it. It just kind of stays as it is. It is extremely expensive, though. Right now, to store and read back a petabyte in DNA is about $1 trillion, so not practical. But companies like Microsoft and others are building the apparatus to make that a whole lot more affordable. Part of the reason it's so expensive is that DNA is biological. It's a wet process. You have to have lab technicians with pipettes and test tubes transferring stuff from one container to another, and it hasn't really been mechanized fully yet.

Speaker 3:

That seems crazy that it could be that expensive. But we could keep all the world's data in a refrigerator.

Speaker 2:

Yes, that's right. That's crazy, yep. And that's just for one petabyte. So to put all the world's data into a refrigerator, there isn't the money in the world to do that, right. And the other bad part about DNA is that it is slow, because of what I just mentioned. It's not mechanized yet, and so manipulating all that stuff just takes way too much time. Eric, your question. Let's talk a bit about how it works.

Speaker 2:

So when you write to DNA, first you have to translate the binary code, right. So you're going to start up here. If you're trying to write information on a computer, you're going to turn zeros and ones into ATCGs. Now, in DNA speak (I am not a geneticist, right, but you might recall from chemistry class), it's adenine, cytosine, guanine and thymine, so those are the nucleotide bases that you have to work with. So you convert those zeros and ones into those letters, ATCG, and they have four things to work with. Then you take that to a DNA printer. They make these things, and the DNA printer will then print out a synthetic strand of DNA in ATCGs. Then you take that synthetic strand of DNA and you store it. Again, you're talking about lab techs, right, with droppers and stuff, and so you have to store that into a vial or a test tube, and then you can store that.
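The conversion from zeros and ones to letters can be illustrated with a minimal sketch. It assumes the simplest possible mapping, two bits per base (00 to A, 01 to C, 10 to G, 11 to T). That mapping and the function name are assumptions for illustration only. The talk does not specify an encoding, and practical schemes typically add error correction and avoid long runs of the same base.

```python
# Hypothetical two-bits-per-base mapping, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def encode_to_dna(data: bytes) -> str:
    """Turn raw bytes into a string of A/C/G/T, two bits per base."""
    # Write every byte as eight binary digits, then join them together.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Walk the bit string two digits at a time and look up each base.
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


strand = encode_to_dna(b"Hi")
print(strand)  # CAGACGGC
```

Each byte becomes four bases under this mapping, so the two-byte input yields an eight-letter sequence that a DNA printer would then synthesize.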

Speaker 3:

What they use in Jurassic Park is the shaving cream bottle, right, wasn't it? Yeah, they use the shaving cream bottle.

Speaker 2:

Yeah, probably the best place to store it. Absolutely, keep it safe from the wandering T-Rex, or the Russians and Chinese.

Speaker 3:

That's right, yeah.

Speaker 2:

So when you go to read it back then you have to read it with a sequencer. So you read out the sequences of ATCGs that are in that storage. You put that into the computer. The computer then converts those ATCGs back into zeros and ones, and now you got your data back.
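The read-back step can be sketched the same way. The example assumes the sequencer has already produced a clean string of letters and that the data was written with a simple two-bits-per-base mapping (A is 00, C is 01, G is 10, T is 11). Both are assumptions for illustration. Real sequencer output contains errors and needs correction before decoding.

```python
# Hypothetical mapping; it must mirror whatever mapping was used when writing.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def decode_from_dna(strand: str) -> bytes:
    """Turn a string of A/C/G/T back into the original bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 bases (one byte)")
    # Replace each base with its two-bit code.
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    # Regroup the bits into bytes, eight digits at a time.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


print(decode_from_dna("CAGACGGC"))  # b'Hi'
```

Four bases decode to one byte here, which is why the sketch rejects strands whose length is not a multiple of four.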

Speaker 3:

Yep, it was a Barbasol can.

Speaker 1:

There you go, there you go. Barbasol. Nick doesn't use Barbasol, though, as you can see.

Speaker 3:

I'm against it. Think of all the money you save. All natural yeah.

Speaker 2:

Yep. So yeah, DNA is probably the one that's being worked on the hardest because it's going to be around forever. So long after we lose interest in maintaining the infrastructure for reading mechanical hard drives and tapes, we will have infrastructure for reading DNA, because we'll never stop sequencing DNA, since we're made of this stuff. So in terms of not just data longevity, but in terms of technology longevity, it's got tremendous potential. There's a sibling to DNA. RNA? No, but that's a good segue, though. But no, this is molecular memory, right. So insofar as DNA stores data onto these nucleotide bases, which are molecules, molecular memory does the same thing, but it gets away from having to use those four bases. So now you can use different molecular structures to record that data and you're not confined to doing it within DNA.

Speaker 2:

The promise behind this one is that you can get those same densities. However, the challenge is that in order to write to a molecule, you have to keep it still, and so, to keep a molecule still, you have to make it extremely cold, which requires liquid helium, because liquid nitrogen just won't even make it cold enough. So that's when you've got to bring in your heavy cryogenics and you've got to cool this stuff down, and you get into a whole problem that's facing quantum computing today, which is you're spending a lot of energy to make something very cold so that you can then do something with it. But they are looking into this right now, sort of alongside DNA, but I think, as you can see, it presents different challenges, and my suspicion is that DNA might actually kind of make it out first, before this one does.

Speaker 3:

What is this one, Bill? How does this one look for size? Is it similar to... Maybe you already said that. Is it similar to the DNA?

Speaker 2:

Yeah, it is very similar to DNA, because the molecules that you choose to write to can be as small as you can write to. Got it. Cost for this the same as well? Probably not as bad, because the biggest cost for this would be the cryogenics, for the most part just cooling it down. But unlike DNA, you don't have to have a bunch of techs. It doesn't have to be a wet technology, I guess is what I'm trying to say. So it can just be a lot more practical.

Speaker 1:

Is the DNA just sitting in liquid in a test tube, or how do they store it?

Speaker 2:

Basically they store it. So if you've seen on TV how they put those little pipettes into those little dishes, that's basically how they're storing the DNA. Reading back from the DNA is so slow because you have to scan through all that material to find the information that you're looking for. So one of the things that they're doing to develop DNA is come up with clever methods for searching through that sheer amount of data in a lot more rapid fashion. So that's molecular. And that's about as far as we can really go, because that'll take us into probably the mid-2030s, a dozen years in the future from now.

Speaker 2:

So, to try to tie all this up, there are a few points to keep in mind. One is that the current technologies that we have today will be around for several more years. They've got a lot of life left in them. You're going to see tape, you're going to see drives, you're going to see those solid state disks. They're not done perfecting that yet and they are the bridge that will get us to the next level, which is, as we've discussed, something like DNA or molecular, and that is going to result in this sudden jump, this big quantum leap in capabilities, especially in terms of capacities, maybe in terms of speed if they can perfect that technology.

Speaker 2:

But this can't come too soon, because we do not have the technology today to really get us past 2030 with how we're writing information. If we try to persist, we're just going to basically drain the world of silicon. We can't mine the stuff fast enough, so we must move to something else, otherwise we're going to be in huge trouble. One way to keep an eye on how this is progressing is, A, do your own research, too, and dig into this more, but also keep an eye on what's happening at the cloud providers. They're the ones with the monstrous data centers and they are the ones who stand to benefit the most from implementing these technologies, as it will reduce their costs and also provide them an edge as they seek to court new customers.

Speaker 1:

Bill, do you have any stock tips on companies that are doing this DNA storage?

Speaker 2:

Aside from Microsoft, which would be one of them, for sure. No, none of the smaller companies. I'm not sure any of the smaller companies were working on it.

Speaker 1:

I was hoping for a good dollar stock. Nick, we get in early.

Speaker 3:

Anything that's off the wall. 2030, we're out. I keep thinking about this stuff too right now, Bill, so we're looking. Going back to the tape, so big or small organizations, they have this stuff, the tape technology, for backup on site. Is there ever going to be a point where these technologies are also on site, or are they always going to be at a big data center, like we saw the Fujifilm tapes? Right, a big data center. Do we ever foresee us getting to the point where this is readily available for an organization to have it on location?

Speaker 2:

Yeah, I do. I think it's going to be readily available for an organization. The question I have will be whether they will want it, because for big companies it might just be easier for them to say we're just going to continue to transfer things over the wire to a hosting provider for other reasons. Yeah right, interesting.

Speaker 1:

As these storage mediums grow in capacity, meaning we're able to store more things in a smaller area, then doesn't that just allow us to create more data, as I think of 4K video going to 8K video and then 16K video or whatever it is. We're just going to keep expanding on our use of data and essentially stay lockstep with the developments in storage.

Speaker 2:

Yeah, it's a cyclical thing and I kind of wonder which one drives which. Yeah, I mean, I think these advancements in capacity are in response to our growth in technology and in some ways you can say our growth in technology is due to advancements in capacity.

Speaker 1:

Yeah, it's really cool stuff. Well, thanks for taking us through this, Bill. You've been listening to The Audit, presented by IT Audit Labs. We are experts at assessing security, risk and compliance, while providing administrative and technical controls to improve our clients' data security. Our threat assessments find the soft spots before the bad guys do, identifying likelihood and impact, while our security control assessments rank the level of maturity relative to the size of your organization.

The Future of Storage Technology
Hard Drive and NAND Technology Advancements
Holographics and DNA Storage for Data
The Impact of Increasing Storage Capacity