decoder/emb/token_emb/weight: normal(0, 1.0 / fan_out), shape=[32768, 4096], axes=FanAxes(in_axis=-2, out_axis=-1, batch_axis=())
decoder/transformer/repeat/layer/self_attention/norm/scale: constant(1.0)
decoder/transformer/repeat/layer/self_attention/attention/i_proj/i_proj/qkv_proj/weight: normal(0, 1.0 / fan_in), shape=(4096, 32, 128), axes=FanAxes(in_axis=0, out_axis=(1, 2), batch_axis=())
decoder/transformer/repeat/layer/self_attention/attention/o_proj/weight: normal(0, 1.0 / fan_in), shape=(4096, 32, 128), axes=FanAxes(in_axis=(1, 2), out_axis=0, batch_axis=())
decoder/transformer/repeat/layer/feed_forward/norm/scale: constant(1.0)
decoder/transformer/repeat/layer/feed_forward/linear1_0/weight: normal(0, 1.0 / fan_in), shape=(4096, 11008), axes=FanAxes(in_axis=-2, out_axis=-1, batch_axis=())
decoder/transformer/repeat/layer/feed_forward/linear1_1/weight: normal(0, 1.0 / fan_in), shape=(4096, 11008), axes=FanAxes(in_axis=-2, out_axis=-1, batch_axis=())
decoder/transformer/repeat/layer/feed_forward/linear2/weight: normal(0, 1.0 / fan_in), shape=(11008, 4096), axes=FanAxes(in_axis=-2, out_axis=-1, batch_axis=())
decoder/output_norm/scale: constant(1.0)