We include an inefficient reference PyTorch implementation in gpt_oss/torch/design.py. This code uses standard PyTorch operators to show the exact architecture of the model, with one small addition: tensor parallelism in the MoE layers, so that the larger model can run with this code.
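To illustrate what tensor parallelism in an MoE expert means, here is a minimal sketch. It is not taken from the reference implementation. It uses plain Python lists instead of PyTorch so that it runs without dependencies, a ReLU activation as a stand-in for the real one, and a loop over ranks plus a sum in place of a distributed all-reduce. All function names (`expert_mlp`, `shard_expert`, `tensor_parallel_expert`) are hypothetical.

```python
# Sketch only: simulates tensor parallelism for one MoE expert MLP on a single
# process. Names, shapes, and the ReLU activation are illustrative assumptions.


def matmul(x, w):
    """Multiply a vector x (length n) by a matrix w (n rows, m columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def relu(v):
    """Elementwise ReLU; any elementwise activation shards the same way."""
    return [max(0.0, a) for a in v]


def expert_mlp(x, w_up, w_down):
    """Unsharded expert: hidden -> intermediate -> hidden."""
    return matmul(relu(matmul(x, w_up)), w_down)


def shard_expert(w_up, w_down, rank, world_size):
    """Return the slice of the expert weights owned by one rank.

    w_up is split by columns and w_down by rows, both along the
    intermediate dimension, so each rank holds 1/world_size of the weights.
    """
    intermediate = len(w_down)
    if intermediate % world_size != 0:
        raise ValueError("intermediate size must be divisible by world_size")
    chunk = intermediate // world_size
    lo, hi = rank * chunk, (rank + 1) * chunk
    up_shard = [row[lo:hi] for row in w_up]
    down_shard = w_down[lo:hi]
    return up_shard, down_shard


def tensor_parallel_expert(x, w_up, w_down, world_size):
    """Run every rank's shard and sum the partial outputs.

    The final sum stands in for the all-reduce that a real multi-GPU
    setup would perform across devices.
    """
    partials = []
    for rank in range(world_size):
        up_shard, down_shard = shard_expert(w_up, w_down, rank, world_size)
        partials.append(expert_mlp(x, up_shard, down_shard))
    return [sum(values) for values in zip(*partials)]
```

Because the activation is elementwise, each rank can compute its slice of the intermediate activations independently, and the summed partial outputs match the unsharded result up to floating-point rounding. This is why sharding along the intermediate dimension lets a model with large experts fit across several devices.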