Mixture of experts Example #260

Open
kiranmaya opened this issue Sep 5, 2024 · 1 comment

kiranmaya commented Sep 5, 2024

I need a Mixture of Experts (MoE) example. ChatGPT wrote something, but I don't think it's a good gating network:
```csharp
using System;
using Tensorflow;
using static Tensorflow.Binding;
using Tensorflow.Keras;
using Tensorflow.Keras.Layers;
using Tensorflow.Keras.Models;

class Program
{
    static void Main(string[] args)
    {
        int inputDim = 10;
        int outputDim = 1;
        int numExperts = 3;

        // Create experts
        var experts = new Sequential[numExperts];
        for (int i = 0; i < numExperts; i++)
        {
            experts[i] = CreateExpert(inputDim, outputDim);
        }

        // Input tensor
        var inputLayer = keras.Input(shape: new int[] { inputDim });

        // Output from dynamic routing
        var output = DynamicRouting(inputLayer, experts, numExperts);

        // Build model
        var moeModel = keras.Model(inputLayer, output);
        moeModel.compile(optimizer: keras.optimizers.Adam(), loss: "mse");

        moeModel.summary(); // Optional: to view the model summary
    }

    static Sequential CreateExpert(int inputDim, int outputDim)
    {
        var model = keras.Sequential();
        model.add(keras.Input(shape: new int[] { inputDim }));
        model.add(new Dense(64, activation: keras.activations.Relu));
        model.add(new Dense(outputDim, activation: keras.activations.Linear));
        return model;
    }

    static Tensor DynamicRouting(Tensor inputTensor, Sequential[] experts, int numExperts, int numActions = 2)
    {
        // Policy network to select experts dynamically
        var policyLayer = new Dense(numActions, activation: keras.activations.Softmax).Apply(inputTensor);

        // Expert outputs
        var expertOutputs = new Tensor[numExperts];
        for (int i = 0; i < numExperts; i++)
        {
            expertOutputs[i] = experts[i].Apply(inputTensor);
        }

        // Selecting the expert with the highest probability
        var selectedExpert = tf.argmax(policyLayer, axis: 1);

        // Gather the output from the selected expert
        var output = tf.gather(expertOutputs, selectedExpert);

        return output;
    }
}
```
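For comparison, here is what I would expect a gating network to compute. The code above picks one expert with a hard `argmax`, which passes no gradient back to the policy layer, and its policy layer has `numActions = 2` outputs for 3 experts. A common alternative is soft gating: one softmax weight per expert, and the output is the weighted sum of the expert outputs. The sketch below shows only that arithmetic in plain C#, with no TensorFlow.NET calls; the names `SoftGatingSketch`, `Softmax`, and `Mix` are hypothetical and each expert is assumed to return a single scalar.

```csharp
using System;
using System.Linq;

static class SoftGatingSketch
{
    // Numerically stable softmax over the gate logits (one logit per expert).
    public static double[] Softmax(double[] logits)
    {
        double max = logits.Max();
        double[] exps = logits.Select(z => Math.Exp(z - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }

    // Soft mixture: y = sum_i g_i * f_i, where g = softmax(gateLogits)
    // and f_i is the (scalar) output of expert i.
    public static double Mix(double[] gateLogits, double[] expertOutputs)
    {
        if (gateLogits.Length != expertOutputs.Length)
            throw new ArgumentException("Need exactly one gate logit per expert.");

        double[] weights = Softmax(gateLogits);
        double y = 0.0;
        for (int i = 0; i < weights.Length; i++)
            y += weights[i] * expertOutputs[i];
        return y;
    }

    static void Main()
    {
        // Equal logits give each of the 3 experts weight 1/3,
        // so the mixture is the plain average of the expert outputs.
        double[] gateLogits = { 0.0, 0.0, 0.0 };
        double[] expertOutputs = { 1.0, 2.0, 3.0 };
        Console.WriteLine(Mix(gateLogits, expertOutputs));
    }
}
```

In a Keras-style model the same idea would presumably mean a gating `Dense` layer with `numExperts` units and softmax activation, followed by a weighted sum over the expert tensors, but I have not verified how to express that with the TensorFlow.NET API.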

github-actions bot commented Dec 3, 2024

Stale issue message
