- Fix Dense.__repr__ to include the missing d (sequence length) parameter
- Default LayerNorm and BertLayer to use approximate rsqrt (approx=True) for better MPC performance
- Remove debug prints
- Fix SubMultiArray.__add__ to handle addition with plain arrays by checking the operand for a sizes or size attribute
- Fix SubMultiArray.__str__ to use self.address instead of self.array._address to avoid attribute errors
- Add a LAYER_COMPARISON flag to bert_inference.mpc; the expensive layer-by-layer comparison section (~95% of compile time) is now skipped by default
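
For reviewers, the attribute check behind the SubMultiArray.__add__ fix can be sketched in plain Python. This is only an illustration under stated assumptions: ToyArray, ToyMultiArray and _operand_sizes are hypothetical stand-ins, not the actual MP-SPDZ classes, and the element-count comparison is an assumed policy rather than what the patch necessarily does.

```python
import math


class ToyArray:
    """Hypothetical plain 1-D array exposing only `size` (not the real class)."""

    def __init__(self, values):
        self.values = list(values)
        self.size = len(self.values)


class ToyMultiArray:
    """Hypothetical multi-dimensional array with flat storage and `sizes`."""

    def __init__(self, sizes, values):
        self.sizes = tuple(sizes)
        self.values = list(values)
        if math.prod(self.sizes) != len(self.values):
            raise ValueError("sizes do not match number of values")

    @staticmethod
    def _operand_sizes(other):
        # Multi-dimensional operands carry `sizes`; plain arrays carry `size`.
        if hasattr(other, "sizes"):
            return tuple(other.sizes)
        if hasattr(other, "size"):
            return (other.size,)
        return None

    def __add__(self, other):
        other_sizes = self._operand_sizes(other)
        if other_sizes is None:
            # Let Python try the reflected operation or raise TypeError.
            return NotImplemented
        # Assumed policy: operands are compatible if element counts agree.
        if math.prod(other_sizes) != math.prod(self.sizes):
            raise ValueError("operand shapes are incompatible")
        summed = [a + b for a, b in zip(self.values, other.values)]
        return ToyMultiArray(self.sizes, summed)


a = ToyMultiArray((2, 2), [1, 2, 3, 4])
b = ToyArray([10, 20, 30, 40])
c = a + b  # plain array operand is accepted via its `size` attribute
```

Returning NotImplemented for operands with neither attribute keeps the failure mode a regular TypeError instead of an AttributeError raised deep inside the method.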