Obelisk GRN1 Chip Details

#1

I wanted to come forward and share some of the details about the GRN1 chip that we are making. Obelisk is here to move the space forward, and a part of that is increased transparency around the ASIC design process so that the community can better understand what goes into an ASIC project and make more informed decisions around proof of work. The chip was co-architected between Obelisk and ePIC Blockchain, and ePIC Blockchain (www.epicblockchain.io) led the implementation of the chip.

The Obelisk GRN1 specs have been updated to 150 graphs per second at 800 watts. The first units are on track to begin shipping early October, and the final units are on track to ship in late October. Our specifications are based on a pessimistic interpretation of simulations that are generally +/-10% for speed and +/-30% for power.

Because of NDAs, I can’t share exact information about our design choices. But I can link to public information and speak broadly about the implications. The GRN1 chip is a single-die cuckatoo31 miner. The chip we made has a full 512 MiB of memory on board. Our technology partners have verified that we are within the margins of test, packaging and yield boundaries, and that the final product will be fully viable. Beyond substantially increased speed and efficiency, using a single chip also reduces manufacturing complexity and enables a more reliable final product.

We believe that the most efficient Cuckatoo32 miner is also a single-die chip using today’s technology. And we also believe that the most efficient Cuckatoo33 and Cuckatoo34 miners are single-die chips using today’s technology. We’ve done substantial investigation into the memory capabilities of modern foundries, but before I go further, I want to provide some basic statistics:

According to the above articles, the smallest SRAM cell at TSMC 16nm is 0.074 square micrometers. And at 7nm, the smallest SRAM cell at TSMC is 0.027 square micrometers. The largest chips have over 800 square millimeters of area. If you do the math, this means that a 16nm chip has a maximum theoretical memory size of about 1.3 GiB, and a 7nm chip has a maximum theoretical memory size of about 3.5 GiB.
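For anyone who wants to check the arithmetic, here is a minimal sketch of that calculation. It assumes the entire 800 square millimeter die is filled with bit cells at one bit per cell, which is exactly the "maximum theoretical" simplification made above; the function and constant names are illustrative only.

```python
# Theoretical upper bound on SRAM capacity if the whole die were bit cells.
UM2_PER_MM2 = 1_000_000   # square micrometers in one square millimeter
BITS_PER_GIB = 8 * 2**30  # bits in one GiB


def max_sram_gib(die_area_mm2: float, cell_area_um2: float) -> float:
    """Return the bit-cell-only capacity of a die in GiB (one bit per cell)."""
    cells = die_area_mm2 * UM2_PER_MM2 / cell_area_um2
    return cells / BITS_PER_GIB


# Cell sizes quoted in the post; 800 mm^2 is the die area quoted in the post.
for node, cell_um2 in (("16nm", 0.074), ("7nm", 0.027)):
    print(f"{node}: {max_sram_gib(800, cell_um2):.2f} GiB")
```

This gives roughly 1.26 GiB at 16nm and 3.45 GiB at 7nm, in line with the rounded figures of about 1.3 GiB and about 3.5 GiB.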

In practice, you cannot create a 16nm chip with 1.3 GiB of memory, because a chip is a lot more than just an array of bits. Even so, at 16nm there is more than enough room to do a full CC31 mining algorithm, including cycle finding. Beyond that, the foundry does have limits to how much memory they can support, and every piece of memory that you add impacts yields. Due to NDAs, I’m unable to share the maximum amount of memory we believe we could put on a single 16nm die, but I am comfortable saying that it’s more than 512 MiB. Putting this much memory on a single die does impact yields; however, the impact is small enough that our chip remains viable. As a process matures, yields improve substantially, and the TSMC 16nm process is quite mature at this point.

At 7nm, we believe that we could do all the way out to Cuckatoo33 without needing to make a 2x time-memory trade-off. To say that again, we are confident that you can make a single-die ASIC to do CC31, CC32, and CC33. We also believe that CC34 is possible, though at this point substantial time-memory trade-offs are required.
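A rough way to see where these limits come from is to compare the memory each graph size needs against the bit-cell-only ceiling of an 800 square millimeter die. The sketch below is built on assumptions, not on Obelisk's design data: it takes the 512 MiB figure for Cuckatoo31 from this post and assumes the requirement doubles with each size step, and it uses the theoretical ceiling from the SRAM cell sizes quoted earlier, which ignores logic, routing, and foundry limits.

```python
MIB = 2**20


def full_memory_mib(edge_bits: int) -> int:
    """ASSUMPTION: 512 MiB at Cuckatoo31, doubling with every size step."""
    return int(512 * 2 ** (edge_bits - 31))


# Bit-cell-only ceilings for an 800 mm^2 die, in MiB (theoretical, not practical).
CEILING_MIB = {
    "16nm": 800e6 / 0.074 / 8 / MIB,
    "7nm": 800e6 / 0.027 / 8 / MIB,
}


def fits(edge_bits: int, node: str) -> bool:
    """True if the full memory is below the theoretical ceiling for that node."""
    return full_memory_mib(edge_bits) <= CEILING_MIB[node]


for edge_bits in (31, 32, 33, 34):
    row = ", ".join(f"{node}: {fits(edge_bits, node)}" for node in CEILING_MIB)
    print(f"Cuckatoo{edge_bits} needs {full_memory_mib(edge_bits)} MiB -> {row}")
```

Under these assumptions Cuckatoo33 fits below the 7nm ceiling while Cuckatoo34 does not, which is consistent with Cuckatoo34 needing a time-memory trade-off. The real limits are tighter than this ceiling, since the ceiling leaves no area for anything but memory.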

Our Cuckatoo31 chip has a very interesting property relative to typical single-die ASICs - the heat signature. A typical highly optimized Bitcoin mining chip produces between 0.3 and 0.5 watts per square millimeter. Because of this heat profile, a typical $1200 miner may cost $1200 per year to operate, primarily due to the cost of electricity. For an American mining on typical consumer electricity rates, that annual cost is more like $2400 per year, meaning that consumers really cannot afford to mine at home.

Our Cuckatoo31 chip has a heat signature that’s less than 0.1 watts per square millimeter. This translates to much lower electricity costs. The same $1200 spent on a mining device results in electricity costs that are closer to $400 per year for a typical mining farm, and $800 per year for a typical consumer. The total cost of ownership gap between a consumer and a professional farm is substantially lower for cuckatoo miners. This is a key distinguisher for the Cuckatoo31 algorithm and a way that Grin stands out.
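To make the cost-of-ownership gap concrete, here is a small sketch using only the dollar figures from the two paragraphs above. The one-year horizon and the class and field names are assumptions made for illustration; a different lifetime changes the ratios.

```python
from dataclasses import dataclass


@dataclass
class MinerProfile:
    name: str
    hardware_usd: float
    farm_power_usd_per_year: float
    consumer_power_usd_per_year: float

    def tco(self, power_usd_per_year: float, years: float) -> float:
        """Total cost of ownership: hardware plus electricity over the period."""
        return self.hardware_usd + power_usd_per_year * years

    def consumer_penalty(self, years: float = 1.0) -> float:
        """How many times more a consumer pays than a farm over the period."""
        farm = self.tco(self.farm_power_usd_per_year, years)
        home = self.tco(self.consumer_power_usd_per_year, years)
        return home / farm


# Figures quoted in the post; the one-year default horizon is an assumption.
bitcoin_like = MinerProfile("Typical Bitcoin miner", 1200, 1200, 2400)
cuckatoo31 = MinerProfile("Cuckatoo31 miner", 1200, 400, 800)

for miner in (bitcoin_like, cuckatoo31):
    print(f"{miner.name}: consumer pays {miner.consumer_penalty():.2f}x the farm TCO")
```

Over one year the consumer pays 1.50x the farm's total cost for the Bitcoin-style miner and 1.25x for the Cuckatoo31 miner.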

There’s another very interesting aspect to Cuckatoo31 specifically. When optimizing over total cost of ownership, wafer pricing becomes a lot more important. The primary cost of the machine throughout its lifetime is not electricity, but silicon. 16nm silicon is cheaper and more accessible, which means that 16nm chips are more competitive, and depending on price, potentially even strictly superior.

This is fantastic for competition. The development and tooling costs for 7nm are far higher, and the 7nm technology is a lot more exclusive. These higher costs mean that there is much less room for competition. If 16nm chips are potentially superior, smaller companies can compete with lower initial investment and the competitive environment for Grin will be more vibrant.

At cuckatoo32, 16nm is no longer the ideal node, because the memory takes up too much space and doesn’t leave enough room for all the other elements of a grin miner. Even switching to cuckatoo32 means that new competitors have to be at a more advanced node, which restricts the competitive environment.

The phase-out also has another adverse effect: manufacturers are forced to choose an algorithm to target, which complicates the game theory. Since miners that target lower levels are more competitive (this is just the nature of the hardware and the cost structure), being able to support a higher level means you will not be competitive at the lower levels. This also harms the overall competitive environment: you want the economics and profit models to be as simple and low risk as possible to encourage competition.

I know this information is a lot different than what many people were expecting. Most manufacturers are not up-front about the nature of their hardware, but we really would like to see Grin succeed, and we would like to do so by collaborating with the Grin community and letting them know what’s going on before it happens. We’re hoping to open a dialog between manufacturer and community, and we’re hoping to give the community all the information it needs to make informed decisions about the future of the proof of work ecosystem that protects the consensus layer. At the end of the day, Obelisk wants to see Grin succeed and we believe that being open is the best way to empower the Grin community. We’re looking forward to your thoughts and discussion.

21 Likes
#2

Thanks for sharing that, David!

Needless to say, this throws quite a spanner in the works of Grin’s anti-single-chip-ASIC stance, as laid out in the Scheduled PoW upgrades proposal and further discussed in Cuckatoo32 feasibility.

Your point about the very different heat density is well taken; that kind of invalidates my claim that single-chip cuckoo ASICs are not meaningfully different from SHA256 ones.

Perhaps I should admit defeat and accept the reality of single-chip ASICs. If what you say is true and none of the next few size upgrades forces a memory IO bottleneck, then we might be better off disabling future phaseouts from some point on.

This would invalidate our commitment to ASIC manufacturers, so any such change would ideally need to meet with their approval.

I welcome community discussion on the best way forward.

3 Likes
#3

I believe you are paving the road for other ASIC makers to adopt the clarity and transparency you share openly with the GRIN community, as you have done with the SIA ASICs as well. Not only does it build miner (customer) loyalty and trust, it also builds credibility that we can all appreciate and hope to see from others. Sadly, I believe this transparency may spoil many of us: we will seek this sort of information from other ASIC makers without knowing whether we will get it. As such, it falls on GRIN as a community and coin to establish good ethics policies and guidelines, which would be a positive step toward guiding other makers on how they should disclose the specifics of production quantities and the overall platform, within what’s reasonable.

Cheers to Transparency and clear helpful Disclosure.

Here is the GRN1 with the updated specs realtime profitability stats:


Save the date 7/24/2019 - Miami Airport Convention Center - https://miningdisrupt.com - Crypto & Mining Conference and Expo - Join us!

4 Likes
#4

We found this to be true for CryptoNight as well, which also has a large ratio of memory-to-logic on the die. Many people repeat the mantra of “hash-per-watt” but if the miner’s lifetime is short, 12 to 18 months, then capex is more than lifetime opex, even for a Bitcoin miner.

1 Like
#5

For phase-out to accomplish its goals, it must prevent single-chip ASICs from being competitive, but not hurt network security by increasing graph search time of multi-chip ASICs.

Also, to avoid centralization, it should operate on autopilot, and not require continual tweaking via hard or soft fork.

As this post demonstrates, it is likely impossible to adequately anticipate future technical developments such that the phase-out parameter can be fixed in advance.

To add a little bit of color from my own experiences, one ASIC manufacturer I spoke with about the possibility of producing Grin ASICs demurred on the basis that they did not expect to be able to produce a CC31 ASIC significantly better than a GPU. Another was only comfortable producing a CC31 mean-miner.

Ideally, a large number of manufacturers would be able to produce CC ASICs. From my own experience, this is not the case, and phase out would likely exacerbate the situation.

The commitment to ASIC manufacturers permits delaying phase out of smaller graph sizes if better-than 1 GPS miners are not publicly available. It is stretching the language a bit, but I think that delaying phase out of smaller graph sizes indefinitely is not a gross violation of this commitment. Especially if no manufacturer has invested money to research or produce a CC32 ASIC. (Separate from investments in a CC31 ASIC that would be repurposed for an eventual CC32 ASIC.)

Additionally, my own experience was that forecasting mining profitability and investment was greatly complicated by the limited lifespan of Grin mining hardware under phase-out. CC31 ASICs, for example, only mine for something like 6 weight-adjusted months at best. This cuts off the long tail of mining profits and provides a very short period for earning a return on investment. Deterring investment in this way works against the goals of securing the Grin network and ensuring that GPU-powered 51% attacks are uneconomical.

2 Likes
#6

The goal of the commitment was to protect large CC ASIC investments. I believe that such investments have already been made for C32, based on our planned phase out of C31. In which case we shouldn’t undo the latter.

But it’s quite possible that no large investment has been made into C33+ ASIC design (we could ask manufacturers to speak up if they have). In absence of such investments, we can amend the commitment to indefinitely shelve the phase out of C32 and beyond. That would give us time to see how well the various current C31/C32 designs perform in practice, and whether freezing the PoW at C32+ would be preferable.

In theory, if we’re happy to have single chip ASICs, we could even go back to C30+ in 2021.

The commitment has served its purpose in making ASIC manufacturers confident to invest in the current crop of C31/C32 designs. Now perhaps it’s best to increase our options for the future health of the Grin mining market, by shelving the phaseout of C32 until further notice.

#7

Circa 1492

Columbus “Hey, I think I am off on the bearing of the North Star…just by a couple of degrees”

His assistant “Oh, good one Chris, glad you spotted that…Instead of the Bahamas, we could have landed in Greenland with our beach ball and flip-flops”

Columbus “How about just a 1 degree adjustment? …that would take us to Florida”

His assistant “Nah, I hear it is crowded there during Spring Break”.

Columbus “Ok, let’s go make history, on we go with this one adjustment”.

It takes a brave man to chart a course. It takes a braver man to stay committed. The bravest of all are the ones that can internalize & course-correct.

#8

First of all, thanks for sharing. We at Innosilicon also believe that a more open environment will help the ASIC companies gain trust from the blockchain community. In joining this effort, I want to give our view and progress on the development of Innosilicon Grin ASIC.

Overall, I believe the desperate move to pack a monstrous amount of SRAM (512MiB for Cuckatoo31 and 1GiB for Cuckatoo 32) is the wrong approach and runs against the evolution path envisioned by the core team. This will result in high risk and short-lived ASICs (about 4 months at most for Obelisk GRN1), which is why capex becomes more than opex during the lifetime of a miner. Innosilicon fully embraces the evolution path as planned and believes a multi-chips approach of one ASIC coupled with high speed DRAM results in less waste of ASICs and more stable and predictable mining.

Our Grin ASIC will be Cuckatoo31/32 compatible which gives the miner longevity and makes capex less than opex. More details about the chip will be announced in April. The multi-chips approach minimizes the investment on fixed purpose ASICs and allows the DRAM to be repurposed even after Cuckatoo 32 is phased out. This approach is not only more sustainable and greener but reduces risk capital investment by the miners. It also sets us apart from the other Crypto ASIC companies because Innosilicon has been working on high speed memory interface for over 12 years and holds the most advanced high-speed memory interface IPs.

To put 512MiB of SRAM in perspective, AMD’s latest Zen based server processor has up to 64MiB of L3 Cache. Intel’s most powerful Xeon E7-8891 server processor has 60MiB of Cache. Innosilicon has successfully put 160 MiB of SRAM on our ZCash A9 series using a process that is denser than 16nm. From our first-hand experience in embedding large SRAM we believe going for 512 MiB or over 1 GiB is a dead end. The yield loss will be exponentially higher. TSMC is touting the success of its 7nm process and cites yield of 32MiB (256 Mbit) SRAM at 76%. (https://www.eetasia.com/news/article/tsmc-unveils-plans-for-7-12-22nm-nodes ) Imagine what kind of yield you will be looking at for 512MiB and beyond. The bad yield will force one to resort to higher operating voltages which helps to improve yield but power consumption will suffer. Thus I also believe 800W grossly underestimates the power consumption of the final product.
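The "imagine what kind of yield" argument can be sketched with a deliberately naive model. The assumption here, which is not from TSMC or from any manufacturer in this thread, is that every 32 MiB block of SRAM yields independently at the quoted 76%, so yield falls off as a power of the memory size. Real chips use redundancy and repair, and mature processes yield far better than early test vehicles, so treat this as the pessimistic end of the range rather than a prediction.

```python
def naive_yield(sram_mib: float, ref_mib: float = 32, ref_yield: float = 0.76) -> float:
    """ASSUMPTION: each ref_mib block yields independently at ref_yield.

    No redundancy, no repair, no process maturity; a worst-case illustration.
    """
    return ref_yield ** (sram_mib / ref_mib)


for size_mib in (32, 160, 512, 1024):
    print(f"{size_mib:5d} MiB -> {naive_yield(size_mib):.4%}")
```

Under this model, 512 MiB comes out a little above 1%. How far a real design sits from that figure depends on exactly the factors the model leaves out.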

Again, I think it is healthy that we are discussing this topic in an open forum and look forward to more open discussion.

#9

@asic_king, thank you for your perspective! I think it’s important to hear from a diversity of manufacturers.

Overall, I believe the desperate move to pack a monstrous amount of SRAM (512MiB for Cuckatoo31 and 1GiB for Cuckatoo 32)…

I don’t think I would characterize it as a “desperate move”. It is an equally valid point in the design space.

…is the wrong approach and runs against the evolution path envisioned by the core team. This will result in high risk and short-lived ASICs (about 4 months at most for Obelisk GRN1)

The surest way to reduce risk and increase ASIC lifespan would be to cancel phase-out entirely.

which is why capex becomes more than opex during the lifetime of a miner.

I think that it can be debated whether a high cost of ASICs vs other capital expenditures is desirable, due to the way that it aligns incentives between miners and the community. ASICs are a form of capital that is coin-specific, i.e. in this case can only be used to mine Grin. Other forms of capital and operational expenditures, such as datacenter real estate and buildings, provisioned power capacity, and electricity, can be repurposed to mine any other coin. Thus, a miner who has invested a large amount in ASICs will be inclined to protect their investment and refrain from doing anything that might hurt the coin.

Innosilicon fully embraces the evolution path as planned and believes a multi-chips approach of one ASIC coupled with high speed DRAM results in less waste of ASICs and more stable and predictable mining.

As I mentioned, the surest way to reduce ASIC waste and make mining more stable and predictable is to cancel phase-out.

Our Grin ASIC will be Cuckatoo31/32 compatible which gives the miner longevity and makes capex less than opex. More details about the chip will be announced in April. The multi-chips approach minimizes the investment on fixed purpose ASICs and allows the DRAM to be repurposed even after Cuckatoo 32 is phased out.

Can you share the cost, power consumption, and graphs-per-second of your upcoming miner? An important point in this discussion is whether or not multi-chip ASICs will be profitable in the face of single-chip ASICs, and without the specs of some multi-chip ASICs, this is something that we cannot evaluate.

I think that given these plans to produce a Cuckatoo32 miner, it might not be fair to cancel phase out entirely. However, halting it at CC32 still might be reasonable.

This approach is not only more sustainable and greener…

Again, cancelling phase-out is the most sustainable and green option.

but reduces risk capital investment by the miners. It also sets us apart from the other Crypto ASIC companies because Innosilicon has been working on high speed memory interface for over 12 years and holds the most advanced high-speed memory interface IPs.

Individual manufacturers possessing intellectual property that enables them to produce more competitive ASICs by monopolizing specific technologies is a net negative.

To put 512MiB of SRAM in perspective, AMD’s latest Zen based server processor has up to 64MiB of L3 Cache. Intel’s most powerful Xeon E7-8891 server processor has 60MiB of Cache. Innosilicon has successfully put 160 MiB of SRAM on our ZCash A9 series using a process that is denser than 16nm. From our first-hand experience in embedding large SRAM we believe going for 512 MiB or over 1 GiB is a dead end. The yield loss will be exponentially higher. TSMC is touting the success of its 7nm process and cites yield of 32MiB (256 Mbit) SRAM at 76%. (https://www.eetasia.com/news/article/tsmc-unveils-plans-for-7-12-22nm-nodes ) Imagine what kind of yield you will be looking at for 512MiB and beyond. The bad yield will force one to resort to higher operating voltages which helps to improve yield but power consumption will suffer. Thus I also believe 800W grossly underestimates the power consumption of the final product.

I don’t think that David has specified whether or not the GRN1 is using 16nm or 7nm, but it seems that 512MiB at 16nm is possible. For CC32, if the full 1024MiB is impossible to fit on a single chip, time-memory tradeoffs are possible to keep the design on a single chip with reasonable yields.

Edit: I also wanted to note that I’m not a current nor planned investor in Obelisk Grin ASICs.

2 Likes
#10

I want to highlight the importance of fair play and free market competition for anyone who truly wants to support the Grin PoW. The Grin core team formally outlined a very clear goal in its PoW design, along with a phase-out plan to encourage the multi-chip evolution path using external commodity memories, after many months of serious discussion before the final decision. Whoever wants to supply the Grin mining equipment should adhere closely to the announced PoW rules instead of attempting to tailor the PoW for one’s self interest. Whether to use a multi-chip strategy or a single chip is the vendor’s own technical choice. After all, you make the chip to support Grin, not the other way around.

We think multi-chip using external memory can yield very competitive results for both CC31/32 and beyond. We will announce more specifics in April about our design. On the other hand, people who attempt the single ASIC chip approach should follow the same evolution rule to meet the GRIN PoW requirement. I just want to point out the many hazards they will be facing. One can try time-memory tradeoffs to keep the SRAM size from growing beyond 512MiB when supporting CC32, but the performance penalties are very steep according to our analysis. I think these risks are becoming apparent over time, and a single-chip design won’t be competitive compared to a multi-chip approach with external memory. You need to be responsible for your design approach, and it’s not right to lobby the core team into delaying the roll-out of the phase-out plan. Innosilicon is very good at single-chip design, but we don’t think single chip is the best approach for GRIN. We have come out to support the vision of the Grin team by providing more responsible and more robust solutions. We will demonstrate to the world that the multi-chip ASIC solution will work better and work longer. It fits well with the original vision of the team and will attract support from the mining community.

I also want to comment on the topic of memory interface monopoly since it is mentioned, which is funny. Commodity memory usage is nothing special nowadays. Innosilicon is a strong design company that has mastered high-speed memory IPs, but there are plenty of IP vendors in this space, like Rambus, Synopsys and Cadence, that offer similar IPs. Memory interface technology is widely accessible in the industry. Nothing prevents anyone from competing in the GRIN PoW mining field. Healthy competition on a level playing field encourages design excellence and innovation for Grin. All in all, if you are a GRIN supporter, you just need to play by the GRIN rules. I believe more ASIC companies working on GRIN designs will help GRIN gain more popularity, and Innosilicon is happy to be part of the GRIN family.

#12

Thanks for explaining everything in such detail. As you said, this update will empower the Grin community. :+1:

#13

Whoever wants to supply the Grin mining equipment should adhere closely to the announced PoW rules instead of attempting to tailor the PoW for one’s self interest.

I don’t believe the issue is that anyone is trying to tailor the PoW to their own self interest. The issues, as I understand them, are:

  • Phase-out cannot prevent single-chip ASICs, which was its original intention
  • Phase-out may result in periods where single-chip ASICs dominate, and periods where multi-chip ASICs dominate, which increases hashrate fluctuations
  • Phase-out makes mining operations less profitable, due to limited lifespan of miners
  • Phase-out makes planning investments unappealing, due to uncertainty of projecting lifespan of miners

We think multi-chip using external memory can yield very competitive results for both CC31/32 and beyond. We will announce more specifics in April about our design.

I would love to be proven wrong, but my assumption, until specifics are available, is that multi-chip designs will not be competitive due to increased latency.

I also want to comment on the topic of memory interface monopoly since it is mentioned, which is funny.

Earlier you said:

It also sets us apart from the other Crypto ASIC companies because Innosilicon has been working on high speed memory interface for over 12 years and holds the most advanced high-speed memory interface IPs.

In other words, Innosilicon is exploiting patent and copyright-granted monopoly power to gain an advantage over other manufacturers. Many manufacturers do this, so I don’t want to imply that Innosilicon is especially bad in this respect. However, it is not healthy for the coin if producing a competitive Grin ASIC requires acquiring large amounts of expensive IP in order to compete.

Commodity memory usage is nothing special nowadays.

This contradicts your statement that Innosilicon has a strong advantage here.

2 Likes
#14

Welcome to the community, and thanks for taking the time to comment! I have long wished that more manufacturers than Obelisk would get involved in the discussion and provide a more transparent view into the world of hardware, so that cryptocurrency communities can make more informed decisions, and also so that the historically adversarial relationship between hardware companies and cryptocurrency communities can become a more collaborative one.

This yield quote is on TSMC’s 7nm process from more than a year before the 7nm process was cleared for mass production. During process bringup yields are usually incredibly low - part of process bringup is improving the yields, and I believe this statistic was chosen to be published because many people thought 7nm was very far away at the time, and yields that good already meant that 7nm was getting very close.

Yields today for TSMC’s 7nm process are substantially better. But that’s also not the right comparison point anyway, because Obelisk is using TSMC 16nm, one of the most mature and highest yielding processes in the industry. So not only is it an unfair and misleading statistic to assess 7nm, it’s also unfair and misleading because it’s comparing 7nm statistics to a 16nm chip.

Here is a newer article. Though it doesn’t give specific yield statistics, it does mention AI chips that go up to 2 gigabits of SRAM, which is 256 MiB. Clearly, this volume of memory is not as unprecedented as Innosilicon is suggesting. https://www.chipestimate.com/The-New-Deep-Learning-Memory-Architectures-You-Should-Know-About/eSilicon/Technical-Article/2018/10/16

The Bitmain Ethereum miner has DRAMs on it. Those DRAMs cannot be repurposed by a consumer: to repurpose them you would need a professional shop, special firmware, and probably a custom board to receive those DRAMs. Bitmain did it this way because it’s cheaper and more efficient to use non-repurposeable DRAM for specialty applications. Doing the repurposing would almost certainly be more expensive than just buying brand new DRAMs. In the case of the Bitmain Ethereum miner, that DRAM is not practically repurposable and nobody is going to try.

I had assumed that Innosilicon would be using a similar strategy to save on cost, power, and manufacturing complexity. Innosilicon: are you suggesting instead that your miner will have DRAM that is intended to be repurposed, and if so, how difficult will it be to repurpose and what application / hardware could it be repurposed for? Will consumers be able to repurpose the memory without professional help?

It may be widely accessible, but it’s also expensive. Certain manufacturers, namely Innosilicon, have this IP in-house and do not need to pay for it. For everyone else, it is an expense that increases the barrier to entry and makes competition more difficult. This goes against Grin’s goals of having a competitive ASIC ecosystem with many manufacturers.

Obelisk’s current roadmap includes both a CC31 miner in October, and a single-die CC32 miner in April. Similar to the performance difference between the Antminer S7 and the Antminer S9 (both TSMC 16nm products), we expect computational speed and efficiency to roughly double. Because CC32 is more computationally complex, these things cancel out. Our internal target for our CC32 miner is 200 graphs per second at 1000 watts.
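To see how those targets compare, here is a small sketch that normalizes both miners by the amount of graph they process. Using the number of edges per graph (2 to the power of the edge bits) as a proxy for work is an assumption of this sketch, not a statement about how the chips are actually benchmarked.

```python
def graphs_per_joule(graphs_per_sec: float, watts: float) -> float:
    """Graphs solved per joule of energy."""
    return graphs_per_sec / watts


def edges_per_joule(graphs_per_sec: float, watts: float, edge_bits: int) -> float:
    """ASSUMPTION: work per graph is proportional to its 2**edge_bits edges."""
    return graphs_per_sec * 2**edge_bits / watts


# Specs and targets quoted in this thread: (graphs per second, watts, edge bits).
grn1 = (150, 800, 31)
cc32_target = (200, 1000, 32)

efficiency_gain = edges_per_joule(*cc32_target) / edges_per_joule(*grn1)
print(f"GRN1 (CC31): {graphs_per_joule(*grn1[:2]):.4f} graphs per joule")
print(f"CC32 target: {graphs_per_joule(*cc32_target[:2]):.4f} graphs per joule")
print(f"Edge-normalized efficiency gain: {efficiency_gain:.2f}x")
```

Per graph the two machines look almost the same (0.1875 versus 0.2 graphs per joule), while per edge the CC32 target works out to about 2.13x the efficiency, which is one way to read "these things cancel out".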

The phase out creates a planned obsolescence for our customers buying the GRN1, and that harms the value of the GRN1. However, we believe that the total block reward for AF-CC31 still makes sense to move forward with production, provided that we are careful with total production volumes. At the current price for Grin, we would probably ramp back a bit from the originally planned 10,000 units, though we have not yet fully determined what is appropriate. It’s also difficult for us to determine production volumes when our competitor - who claims to have an earlier shipping date (summer vs. fall) - has not yet released an indication of what the specs are or how many machines they intend to produce.

Obelisk currently believes that we will be able to capture both the CC31 and CC32 markets regardless of the phase-out; however, that requires charging our customers more than twice as much for NRE over the next 18 months. CC32 also requires moving away from 16nm in order to avoid TMTO, which means tape-out costs will be higher and the barrier to entry will be higher for competitors. This is actually good for Obelisk, as it means less competition and higher margins, but I don’t think it’s good for the Grin network.

The original goal of the phase-out was to ensure that single-die ASICs could not be created for the Cuckatoo algorithm. I had protested this at the time, though because I didn’t have strong yield statistics and we hadn’t done the full work on analysis yet, I was unable to confidently assert that single-die ASICs would be viable at Cuckatoo31 and beyond. Obelisk has since shown this, and we are confident asserting that we do not believe multi-die ASICs will be competitive at any Cuckatoo before Cuckatoo35 thanks to the sheer amount of memory you can fit on a single die, and thanks to the fact that yields are in fact not as bad as many fear.

For those still doubting that Obelisk’s chip is viable, I have included a link to a third-party audit of our chip that we had done, both to assure ourselves that we were not making a mis-step, and to assure others that we are a world-class team that is staying within the limits of the technology that we are working with even when we are pushing the boundaries of what people thought was possible: https://pixeldrain.com/l/o3hi6WtF#item=0

I fully agree that Grin needs to protect large CC ASIC investments, and that changing the promise creates uncertainty and harms manufacturer willingness to participate. However, it is not clear to me that delaying the phase-out of Cuckatoo31 is actually damaging to anyone. Cuckatoo32 miners are already allowed on the Grin network today, and in fact get to mine at an advantage. That is, on a per-edge basis, Cuckatoo32 solutions have more weight than Cuckatoo31 solutions.
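The per-edge advantage can be illustrated with a short sketch. It assumes that a solution's weight is proportional to the edge bits times the number of edges, which is my reading of how Grin scales difficulty across graph sizes and is not spelled out in this thread; it also ignores the phase-out schedule, which reduces the weight of smaller graphs over time.

```python
def graph_weight(edge_bits: int) -> int:
    """ASSUMPTION: weight is proportional to edge_bits * 2**edge_bits.

    Any constant factor cancels out in the ratios computed below.
    """
    return edge_bits * 2**edge_bits


def weight_per_edge(edge_bits: int) -> float:
    """Weight earned per edge of the graph that has to be processed."""
    return graph_weight(edge_bits) / 2**edge_bits


per_solution = graph_weight(32) / graph_weight(31)
per_edge = weight_per_edge(32) / weight_per_edge(31)
print(f"C32 vs C31 weight per solution: {per_solution:.3f}x")
print(f"C32 vs C31 weight per edge:     {per_edge:.3f}x")
```

Under that assumption, a Cuckatoo32 solution carries 32/31 of the per-edge weight of a Cuckatoo31 solution, a bonus of roughly 3% before any phase-out adjustment.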

Because this bonus weight exists, the primary advantage of the phase-out of Cuckatoo31 to another manufacturer is that competition is eliminated. Having that as the sole advantage goes against the stated goal of the Grin ASIC-friendly PoW algorithm: to foster competition between ASIC manufacturers.

The phase-out, to the best of my knowledge, does not exist to make miners obsolete, but instead to prevent single-die ASICs from entering the marketplace. But as Obelisk has discovered, Cuckatoo32 and Cuckatoo33 are both feasible to do without a 2x TMTO on a single-die chip using today’s technologies. Even where big investments have been made, I believe that the primary effect of eliminating the phase-out will be increasing competition overall.

I believe, based on several conversations I’ve had with people looking to buy Grin miners, Bitmain, Whatsminer, and Canaan all considered making a Grin ASIC, and all decided against it, with the primary reason being the fractured block reward. With the block reward being spread over 3 different algorithms (Cuckaroo29, Cuckatoo31, Cuckatoo32) over the typical expected lifetime of a miner, the incentive for manufacturers to enter is substantially reduced.

I believe that it would help the discussion a lot if Innosilicon published more information about their miner. As of now, we don’t even know the full range of algorithms that it is capable of targeting. Innosilicon continues to claim that they will be able to ship by the end of the summer. In order for that to be possible, Innosilicon will need to have finalized their architecture already and be working on implementation. So these questions at this point in time should all be reasonable:

  • Is the miner a mean miner or a lean miner?

  • Does the chip itself have a significant amount of memory on it?

  • Is the miner capable of targeting Cuckaroo29 in addition to Cuckatoo31 and Cuckatoo32?

  • How many graphs per second are you expecting (+/- 25%) on Cuckatoo31? Cuckatoo32? Cuckaroo29? And what ballpark (+/- 50%) are you expecting for power consumption?

It does not seem like Innosilicon is concerned about Obelisk’s single chip designs. Based on their own confidence, it does not seem like they believe nixing the phase-out would change the dynamics between Obelisk and Innosilicon, and as best I can tell there is no other ASIC manufacturer seriously considering Grin at this time.

Grin Improvement Proposal 1: put later phase outs on hold and rephrase primary PoW commitment
#15

In the semi industry there are a couple of acronyms in use: DTCO (Design Technology Co-Optimization) and PPAC (Power, Performance, Area, Cost). Time will tell who did their homework on this one. But it feels like the tailwinds are behind the single-die implementation. Innovation and out-of-box thinking will prevail.

#16

The thing is: we now have two companies with different strategies. One will have a single chip that does C31 very quickly but will need another chip for C32; the other has a multi-purpose chip that is compatible with the phase-out / phase-in but is likely slower on C31, or not as efficient.

Would it be fair to change the rules now?
The multi chip will have an advantage over GPUs if it’s released first, and then again once the phase-out starts, and the single chip will rule C31 from release until the end. Note that the phase-out is slow: if you buy the first batch, it’s likely you get a lifespan of 8-9 months until it’s unusable.
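
As a rough illustration of how gradual that decline would be, here is a minimal sketch. It assumes the phase-out lowers the effective edge_bits of a C31 solution by one per week once it begins, which is my reading of the schedule and not something spelled out in this thread. The function names are made up for the example.

```python
# Sketch only: assumes the phase-out lowers C31's effective edge_bits by one per
# week once it begins (my reading of the schedule, not stated in this thread).
EDGE_BITS = 31


def effective_edge_bits(weeks_since_start: int) -> int:
    """Effective edge_bits of a C31 solution N whole weeks into the phase-out."""
    if weeks_since_start < 0:
        return EDGE_BITS  # phase-out has not started yet
    return max(0, EDGE_BITS - (1 + weeks_since_start))


def remaining_weight_fraction(weeks_since_start: int) -> float:
    """Share of the original C31 solution weight that is still credited."""
    return effective_edge_bits(weeks_since_start) / EDGE_BITS


for week in (0, 10, 20, 30):
    print(f"week {week:2d}: {remaining_weight_fraction(week):.0%} of full weight")
```

Under that assumption a C31 miner loses weight in small weekly steps and only hits zero about 30 weeks after the phase-out starts, so most of its earning life comes before and during the early part of the decline.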

In this picture I see the timing working well for any investor, and so I think cancelling the phase out would invalidate too many strategic decisions already taken. C31+ was picked to be ASIC “friendly”, with predictable future evolution rules, to guarantee enough predictability that ASIC development is worth the effort. We should stick to that.

#17

Can you reveal at what process?

I had to look that up and found https://en.wikipedia.org/wiki/Non-recurring_engineering
I would expect that a nontrivial fraction of the design cost of a C31 chip carries over to a C32 chip though.

Does that imply moving to 7nm, or does TSMC offer an intermediate node like 10nm or 12nm that would accommodate C32?
There’s also the option of combining 512 MB of SRAM with 512 MB of EDRAM on a single chip, with negligible performance penalty, although at considerable increase in design complexity. With EDRAM’s much higher density, that should still fit on 16nm.

These statements are somewhat contradictory. The phase out of C31 is damaging to Innosilicon because they judged that, based on this phase out, a multi-chip C31+C32 miner made more economic sense, and thus made investments that depend on the elimination of competing C31-only designs.

The reward is split, in a predefined balance, between the primary Cuckatoo31+ PoW and the secondary cuckaroo29 PoW. It’s not split in the same sense between Cuckatoo31 and Cuckatoo32 because these are competing within the same slice. But you’re right that a manufacturer considering making a single chip C32 ASIC at 7nm is now incentivized to wait for C31 phase out, so that it needn’t compete with much lower cost single chip C31 ASICs at 16nm.

I still think that the idea of a PoW focussing on the memory IO bottleneck is a worthwhile one, but I now doubt that it’s a good idea for Cuckoo Cycle. It makes more sense for PoWs like Equihash or ethash, that have a natural affinity for DRAM.

Cuckoo Cycle is unique in having a natural affinity for SRAM and SRAM in turn has a natural affinity for use on-chip. Originally I thought the phase out schedule would lead to implementations comprising a central Cuckoo controller surrounded by many SRAM chips, with some semblance to a random-access optimized general purpose computer, which seemed appealing to me. But it may as easily end up as a humongous die in spirit that is forced to be spread in pieces across an interposer to get around die size and yield limitations. Those extra complexities may not be as beneficial for computing in general as I had imagined. And they may well be detrimental to competition, as @rodarmar and @Taek have pointed out.

In hindsight I think single chip ASICs are a better fit for Cuckoo Cycle, and the planned phase outs a mistake.

I hereby propose to change our upgrade schedule at the next planned hardfork (currently set for mid-July) to include only the C31 phase out, and to correspondingly restrict our immutability commitment to a lifetime of 18 months.
That is to say, any change to our primary PoW cannot take effect until at least 18 months into the future, unless agreed upon by all affected parties.

If the multi-chip miners turn out to perform better than the single-chip ones on C32, then we can reconsider the C32 phase out at a later date.

#18

The process choice is not finalized, but we are currently considering TSMC 10nm.

Two of our biggest expenses are physical design and mask creation. Physical design is process specific, which means we lose all of our work when switching from 16nm to 10nm: over a million dollars and several months of work. Mask creation though is the one that really kills you. At 16nm, masks are on the order of $5 million. 10nm mask sets are even more expensive than 16nm masks, and those expenses do not translate at all - you have to pay the full price at each tape-out.

The mask prices are the main barrier to entry for competing in the semiconductor industry.

All of the design and architecture work we’ve done (6 months of design work and logical implementation) would translate very well. What took us six months and several million dollars the first time around would take us 6 weeks and several hundred thousand the second time around. It’s the masks that kill us on price and the physical layout that kills us on time.

TSMC offers 12nm, 10nm, and 7nm. 12nm is probably more fairly called “14.5nm”, and likely isn’t good enough for CC32: there is just not quite enough space for all the things we would want. 10nm is a lot better though, and also a fairly mature process with decent yields. Very likely our choice. Its price is close to that of 7nm, but it’s not as exclusive or difficult to access for a small company.
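
For a feel of why space is the constraint, here is a back-of-the-envelope sketch using the TSMC SRAM cell sizes and the roughly 800 mm² maximum die area quoted at the start of this thread. It assumes a Cuckatoo32 chip needs about 1 GiB on board (double the 512 MiB of the C31 chip), and it counts raw bit cells only, so real arrays with their peripheral circuitry would be larger.

```python
# Sketch only: raw bit-cell area, ignoring sense amps, decoders, routing and logic.
# Cell sizes are the TSMC figures quoted earlier in this thread; the 1 GiB
# requirement for Cuckatoo32 is an assumption (double the 512 MiB of the C31 chip).
SRAM_CELL_UM2 = {"16nm": 0.074, "7nm": 0.027}  # square micrometers per bit
MAX_DIE_MM2 = 800.0                             # approximate largest die area
GIB = 1 << 30                                   # bytes


def sram_area_mm2(num_bytes: int, node: str) -> float:
    """Area in mm^2 of the raw bit cells needed to hold num_bytes at a node."""
    bits = num_bytes * 8
    return bits * SRAM_CELL_UM2[node] / 1e6  # 1 mm^2 = 1e6 um^2


for node in SRAM_CELL_UM2:
    area = sram_area_mm2(1 * GIB, node)
    print(f"{node}: {area:.0f} mm^2 of bit cells for 1 GiB "
          f"({area / MAX_DIE_MM2:.0%} of an {MAX_DIE_MM2:.0f} mm^2 die)")
```

On these assumptions, 1 GiB of bit cells alone takes roughly 636 mm² at 16nm, which is most of the largest practical die, versus roughly 232 mm² at 7nm. The nodes in between land somewhere between those two figures.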

A company with large-volume history (hundreds of millions in production) like Innosilicon or Bitmain would have a much easier time getting access to 7nm than a smaller company like Obelisk (tens of millions in production) would.

The way I currently understand it, which may not be correct, is that Innosilicon did not believe a single-die Cuckatoo31 chip was possible, and would have made the same miner regardless of the phase-out. I do not think that they compromised their CC31 performance in order to support CC32. But if they are not willing to reveal their architecture or decision making process, we will not know.

It’s definitely a much bigger issue for single-die chips than it is for multi-die chips. C31 is nice because it’s viable at 16nm, and therefore has more accessibility and lower tape-out cost. Going to 10nm brings you into another tier of competitiveness and cost which will be more restrictive.

If you stop at CC32 it would be good to just make that the permanent choice, barring some significant security issue. Any time you have a changeover, you have this game where manufacturers need to decide between chasing the closer one and chasing the further one. Changing to something like CC30 or another algorithm entirely will just re-awaken these issues.

Based on how Innosilicon is talking, they believe that they will be superior because of yields, which is something we won’t know for sure until chips are coming off of the line. Though, we’ve done our homework on the yields and have high confidence internally that it won’t be an issue.

#19

In how many years (roughly) can we expect the 10nm process to be similarly mature and accessible compared to the current 16nm process?

#20


Here is an article that might answer your question and gives general information about process trends.

#21

10nm is meant to be an in-between node, kind of like TSMC 20nm. At some point, in terms of accessibility and cost, it’ll be fully eclipsed by 7nm (similar to how TSMC 20nm got eclipsed by TSMC 16nm).

So the more relevant question would be how many years would it be until we can expect the 7nm process to be mature and accessible, and my guess is that it wouldn’t happen until 3nm is in full production, which I think would reasonably happen in either 2023 or 2024.