parent nodes: BitGrid | OmnipotentMike
The bitgrid first arose when I misunderstood the idea of a bit slice processor back around 1981. Bit slice turned out to be far less useful and far more complicated than I supposed. My understanding of the term was a VERY simple processor, which operated on a bit slice of a problem.
Since that term is already taken, I've moved mine to Bit Grid to be unique and Google friendly. I've not found any instance of a bitgrid in any search to date ( 7/1/2010 08:36 AM )... lots of complicated things, but nothing this simple in concept.
The processing power of a single cell is trivial. It's very tempting to try to give them more, but that costs hugely in flexibility and elegance. A bitgrid chip will look pretty much like a crystal, very regular, with the only disruptions being for I/O and RAM interface on the edges.
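A minimal Python sketch of what I mean by a cell, assuming each cell has 4 inputs and 4 outputs, with each output driven by its own 16-entry lookup table (the names here are made up for illustration):

```python
# Hypothetical model of a single bitgrid cell: 4 inputs, 4 outputs,
# each output driven by its own 16-entry lookup table indexed by the inputs.
def cell_step(luts, inputs):
    """luts: list of four 16-entry bit lists; inputs: four bits."""
    index = inputs[0] | (inputs[1] << 1) | (inputs[2] << 2) | (inputs[3] << 3)
    return [lut[index] for lut in luts]

# Example program: output 0 computes XOR of all four inputs,
# outputs 1-3 just pass through inputs 0-2.
xor_lut = [bin(i).count("1") & 1 for i in range(16)]
passthrough = lambda bit: [(i >> bit) & 1 for i in range(16)]
luts = [xor_lut, passthrough(0), passthrough(1), passthrough(2)]
print(cell_step(luts, [1, 0, 1, 1]))  # → [1, 1, 0, 1]
```

Everything a cell does fits in 64 bits of static RAM, which is why the silicon should look like a regular crystal.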
A bitgrid can route around failures, provided they do not extend across the whole chip.
Bitgrid allows you to unroll a complete program into hardware, so it can all be run in parallel, pipelined.
At writing ( 7/1/2010 08:40 AM ) there are many things which remain to be done. This is essentially a new computer architecture, on par with von Neumann's model. The scope of what can be done with it expands past my imagination. The opportunities here are vast. Yet, as with any new idea... it's a matter of finding a core group who believe in it, and can get it reified.
A physical chip which works will be a very big milestone. It will cost at least $10,000 in direct costs, plus all of the time and effort to design, simulate, and evaluate the prototype. Testing should be fairly trivial, fortunately.
Because of the cost of prototyping, simulation is the preferred path for software development until that time. I'm still exploring the space to see what the basic parameters of the concept are. I've noted a few things already:
Implementation orthogonality: It doesn't matter if it's a simulator or the latest chip technology, the program will run the same, with no changes. If a chip uses TI's new memory technology to store the lookup table, it doesn't affect the software... it just makes it cheaper, faster, lower power, etc.
6/3/2011 06:43 AM - It should be possible to break down any given stateless logic into a set of equations, each with a single bit result and a set of single bits in. These can be transformed into a set of statements of logic which have 4 or fewer inputs each. Those statements could be mapped to bitgrid cells. This provides a way to use a set of formulae to program a bitgrid without having to route things by hand, or by some form of trial-and-error random logic mapping.
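A sketch of that transformation step, assuming a simple grouping scheme; the intermediate names (t0, t1, ...) are invented for illustration:

```python
# Sketch of the decomposition idea: an equation with more than 4 inputs is
# split into layers of sub-equations with at most 4 inputs each, one per cell.
def decompose(inputs, max_arity=4):
    """Group a wide N-input term into layers of <=4-input sub-terms.
    Returns a list of (name, input group) pairs."""
    layers = []
    terms = list(inputs)
    counter = 0
    while len(terms) > max_arity:
        next_terms = []
        for i in range(0, len(terms), max_arity):
            group = terms[i:i + max_arity]
            name = f"t{counter}"
            counter += 1
            layers.append((name, group))
            next_terms.append(name)
        terms = next_terms
    layers.append(("result", terms))
    return layers

# A 9-input XOR breaks into three sub-terms plus a combiner:
for name, group in decompose([f"a{i}" for i in range(9)]):
    print(name, "=", " ^ ".join(group))
```

Note that a leftover single input (t2 = a8 above) just becomes a pass-through cell. A real version would also enumerate the alternative groupings, per step 3 below.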
Symmetry: It's very easy to rotate and invert the layout of a program, automatically and reversibly.
Fault tolerance: A bad cell can be routed around, automatically and reversibly.
Easy implementation: This should be a very easy chip to design. The lookup table entries are just static RAM, and there are no long high-speed lines, as programming is a rare event and can be done at leisure. All computing results only have to reach the next cell.
Scalability: Because the grid is a homogeneous grid, it can be expanded to any dimensions without complication.
Elegance: It might not count for much with some people, but a very elegant design is easy to think one's way around.
Simple co-processing: A bitgrid could appear as a RAM to the host... can't get simpler.
I asked Metafilter for some feedback, and they did a great job... pointing out some things I hadn't heard of, and pointing out that the BitGrid is just a low-feature FPGA. I think they are right... it's an FPGA without the lumps. FPGA gravy?
A bit of dithering in the transformation process could be done to aid in optimization of routing.
A bit of dithering in the mapping to cells process would be the first step in optimizing the routing.
A measure of delay can be computed while mapping cells; it would provide a metric to minimize, along with jitter and spread across terms.
A naive mapping algorithm would be the basis to deviate from.
In the logical formula level:
Decomposition of logic follows a lot of basic rules, all of which just work and don't require proof once things are debugged. A final check can be done by iterating all possible input combinations and checking them against the required output.
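The exhaustive check could look like this in Python (the majority functions are just example stand-ins):

```python
from itertools import product

# Final check as described: iterate every input combination and compare the
# decomposed version against the required output.
def check_equivalent(f, g, n_inputs):
    """Exhaustively verify two bit functions agree on all 2**n inputs."""
    return all(f(*bits) == g(*bits) for bits in product([0, 1], repeat=n_inputs))

# Original 3-input majority vs. a hand-decomposed version of it:
maj = lambda a, b, c: (a & b) | (a & c) | (b & c)
maj2 = lambda a, b, c: (a & (b | c)) | (b & c)
print(check_equivalent(maj, maj2, 3))  # → True
```

Since each equation has few inputs, 2^n stays small and this brute force check is cheap.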
In the mapped to cell level:
Decomposition to cells generates a set of delay values, which need to be added up as ranges, so that jitter and skew can be estimated. Cost functions can come from the delay information, and the number of cells used.
Each cell has 4 outputs and inputs, each of which can only be used once, and is mapped to an adjacent cell. A set of functions which share inputs can share the cell and use different outputs.
I think the best routing would start at the output and work backwards. The second approach would be to start at the inputs and work forward. There will be cases when things don't fit... and it'll be necessary to tweak things by moving cells.
One of the many display modes would be to color each cell by the number of logical steps away from the result. Cycling the colors would show data flow.
The process of mapping to cells is the tricky bit, I think...
Let's say that each cell has an address [x,y] and each input or output has an address like [x',y']; this makes it possible to route things.
This would transform the cells to a list of possible places to put formulae which doesn't involve actual geometry.
At the start of mapping, you'd have a list of transformation functions which haven't been routed, and a list of cells to put them in.
Take one of the functions and put it into the grid, starting at one end or the other, mark it routed if it fits.
If it doesn't fit, try a different item from the list of transformation functions.
If it doesn't fit... try extending by throwing one of the inputs or outputs out a cell with a replication entry.
Loop until it's all done.
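That loop might be sketched like this, with "fits" simplified to finding a free cell adjacent to the already-placed inputs (a real router would do much more, including the replication-cell trick):

```python
# Minimal sketch of the greedy mapping loop described above.
def greedy_map(functions, width, height):
    """functions: list of (name, input names). Returns {name: (x, y)} or None."""
    placed = {}
    free = {(x, y) for x in range(width) for y in range(height)}
    pending = list(functions)
    while pending:
        progress = False
        for fn in list(pending):
            name, inputs = fn
            spot = find_spot(inputs, placed, free)
            if spot is not None:
                placed[name] = spot
                free.discard(spot)
                pending.remove(fn)
                progress = True
        if not progress:
            return None  # stuck: would need replication cells or re-placement
    return placed

def adjacent(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1

def find_spot(inputs, placed, free):
    """Pick a free cell adjacent to every already-placed input."""
    needed = [placed[i] for i in inputs if i in placed]
    for cell in sorted(free):
        if all(adjacent(cell, n) for n in needed):
            return cell
    return None
```

Running it on a toy chain like a → b → c on a 2x2 grid places all three; a `None` result is where the extension and backtracking steps would kick in.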
Breaking things down...
1. Express desired program as a set of bit equations without state. (A big step from actual use, but we have to start somewhere)
2. Use automated transformation to get it down to a set of bit transformations with 4 inputs each or less.
3. If an equation has more than 4 inputs to get a result, the breakdown into smaller pieces has multiple possible ways of happening... keep track of those combinations, as we may need to use alternatives in the routing process later.
4. Mapping to cells is the process of mapping transforms to the possible slots between [x,y] and [x',y'].
5. Extending inputs and outputs is trivial, but is another place where there are multiple possible ways of happening, which need to be tracked and permuted in the routing process.
6. Once all the transforms are mapped to cells, metrics can be derived from cells used, delay values, delay ranges, delay skew.
7. Once things are mapped, there can be multiple possible mappings, which need to be optimized on some function of the costs.
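A sketch of the metrics in step 6, assuming each cell adds one unit of delay; the function and output names here are illustrative:

```python
# Derive cost metrics from a placed design. Delay of a term is the longest
# path of cells feeding it; skew is the spread of delays across the outputs.
def metrics(placement, functions, outputs):
    """placement: {name: (x, y)}; functions: {name: input names};
    outputs: the names whose delay spread matters."""
    memo = {}
    def delay(name):
        ins = functions.get(name, [])
        if not ins:
            return 0  # a primary input arrives at time zero
        if name not in memo:
            memo[name] = 1 + max(delay(i) for i in ins)
        return memo[name]
    out_delays = [delay(o) for o in outputs]
    return {"cells_used": len(placement),
            "max_delay": max(out_delays),
            "skew": max(out_delays) - min(out_delays)}

funcs = {"a": [], "b": ["a"], "c": ["b"], "d": ["a"]}
spots = {"a": (0, 0), "b": (0, 1), "c": (1, 1), "d": (1, 0)}
print(metrics(spots, funcs, ["c", "d"]))
```

Any weighted combination of these numbers would do as the cost function to minimize across alternative mappings.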
WOW... this thing is possible, even with just me doing it. 8)
There are multiple ways to decompose and recompose bit equations; if they can go into lists or tables, then it's possible to iterate across all of them and find the overall transformations with the lowest cost.
New ideas go here, with timestamp
5/31/2016 06:03 AM - Imagine a bitgrid with a programmable latch on each output: if 0, the output is real time; if 1, it's clocked in an A/B type clocking arrangement. You could trace all the inputs back, and as long as you didn't form a loop, you could turn each intermediate result's "Clock" bit to 0 to speed up processing. You'd then dynamically slow the master clock to allow for propagation delays. You could also count backwards and latch intermediate results in order to prevent skew at some point.
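A toy sketch of the latch idea on a 1-D chain of pass-through cells, where a clocked output costs one master clock tick of delay and a real-time output ripples through within the same tick:

```python
def step(state, clocked, inp):
    """One master clock tick of a 1-D chain of identity cells.
    state: current output of each cell; clocked[i]: 1 = latched, 0 = real time."""
    new = list(state)
    prev = inp  # value arriving at the left edge of the chain
    for i, latch in enumerate(clocked):
        if latch:
            new[i] = prev        # latched: capture input on this clock edge
            prev = state[i]      # downstream sees last tick's output
        else:
            new[i] = prev        # combinational: ripples through this tick
            prev = new[i]
    return new

# All latched: a 1 at the input takes 3 ticks to cross 3 cells.
# All real time: it crosses in a single (slower) tick.
print(step([0, 0, 0], [0, 0, 0], 1))  # → [1, 1, 1]
```

The trade-off shows up directly: fewer latches means fewer ticks per result, but each tick has to be long enough for the longest combinational ripple.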
5/31/2016 06:09 AM - You could treat each bit output like a layer in a neural network, and then randomly twiddle the bits until you get the function that you need in all test cases... once it works, you could then go and tweak each bit that doesn't affect output to reduce power consumption.
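A toy version of the twiddling idea for a single cell: treat its 16 LUT bits as the parameters, flip random bits, and keep only flips that don't hurt the score on the test cases. The target function here (4-input AND) is just an example:

```python
import random

# Random twiddling: flip LUT bits at random, keep a flip only when it does
# not make the test-case score worse, until all test cases pass.
def train_lut(test_cases, seed=0):
    """test_cases: list of (lut index, wanted bit). Returns a 16-bit LUT."""
    rng = random.Random(seed)
    lut = [rng.randint(0, 1) for _ in range(16)]
    def score():
        return sum(lut[idx] == want for idx, want in test_cases)
    best = score()
    while best < len(test_cases):
        i = rng.randrange(16)
        lut[i] ^= 1
        new = score()
        if new >= best:
            best = new
        else:
            lut[i] ^= 1  # revert a flip that hurt
    return lut

cases = [(i, 1 if i == 15 else 0) for i in range(16)]  # target: 4-input AND
lut = train_lut(cases)
assert all(lut[i] == want for i, want in cases)
```

With a full truth table it always converges; with partial test cases, the don't-care bits are exactly the ones left free to tweak for power afterwards.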