MultiTask data preparation for training keras model #458

Open
Devashish13 opened this issue Nov 7, 2024 · 0 comments

Devashish13 commented Nov 7, 2024

Hi,

I am trying to build a GNN model that trains over multiple graphs. I am working with Keras and found that Spektral provides convenient functionality for data preparation and model training with Keras.

In my task I need to perform both classification and regression for each node. Here is the code for a single graph, without using the loader functionality:

import tensorflow as tf
from tensorflow.keras import Input, Model, layers
from spektral.layers import GCNConv

# `graph` is the single input graph (attributes x, a, y); its loading code is not shown.

# Loss weights for the two tasks
lambda1 = 0.3
lambda2 = 0.7

x_in = Input(shape=(graph.x.shape[1],))  # input node attributes
a_in = Input(shape=(graph.a.shape[0],))  # input adjacency matrix
x_1 = layers.Dense(100, activation="relu")(x_in)
x_1 = layers.Dense(100, activation="relu")(x_1)
x_1 = layers.Dense(100, activation="relu")(x_1)

x_1 = GCNConv(256, activation="relu")([x_1, a_in])
x_1 = layers.Dense(100, activation="relu")(x_1)
output1 = layers.Dense(1220, activation="sigmoid", name="classification")(x_1)
output2 = layers.Dense(1220, activation="softplus", name="regression")(x_1)

model = Model([x_in, a_in], [output1, output2])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
    loss={
        "classification": tf.keras.losses.BinaryCrossentropy(),
        "regression": tf.keras.losses.MeanSquaredError(),
    },
    loss_weights={"classification": lambda1, "regression": lambda2},
)

# The first 1220 columns of y are classification targets, the rest regression targets.
model.fit([graph.x, graph.a], [graph.y[:, :1220], graph.y[:, 1220:]], epochs=200)

This works perfectly fine:

model.fit([graph.x, graph.a], [graph.y[:,:1220],graph.y[:,1220:]], epochs = 200)
Epoch 1/200
1/1 [==============================] - 1s 1s/step - loss: 1.8263 - classification_loss: 0.7432 - regression_loss: 2.2904
Epoch 2/200
1/1 [==============================] - 0s 6ms/step - loss: 1.4579 - classification_loss: 0.6213 - regression_loss: 1.8165
Epoch 3/200
1/1 [==============================] - 0s 5ms/step - loss: 6.7970 - classification_loss: 0.5442 - regression_loss: 9.4768
Epoch 4/200
1/1 [==============================] - 0s 6ms/step - loss: 1.2732 - classification_loss: 0.4492 - regression_loss: 1.6263
Epoch 5/200
1/1 [==============================] - 0s 5ms/step - loss: 1.3546 - classification_loss: 0.4543 - regression_loss: 1.7405

If I perform only regression or only classification, I can easily work with a SingleLoader. The problem arises when I want to perform both tasks, because loader.load() provides the targets as a single tensor instead of a list of two outputs.

from spektral.data import SingleLoader

# DatasetRead is a custom dataset class (definition not shown).
dataset = DatasetRead(id_list=single_id, graph_path=graph_path)
loader = SingleLoader(dataset, epochs=1)  # loader for a single graph

# Loss weights for the two tasks
lambda1 = 0.5
lambda2 = 0.5

x_in = Input(shape=(514,))   # node features: 514 per node
a_in = Input(shape=(None,))  # adjacency matrix: variable number of nodes
x_1 = tf.keras.layers.Dense(100, activation="relu")(x_in)
x_1 = tf.keras.layers.Dense(100, activation="relu")(x_1)
x_1 = tf.keras.layers.Dense(100, activation="relu")(x_1)

x_1 = GCNConv(256, activation="relu")([x_1, a_in])
x_1 = tf.keras.layers.Dense(300, activation="relu")(x_1)

output1 = tf.keras.layers.Dense(1220, activation="sigmoid", name="classification")(x_1)
output2 = tf.keras.layers.Dense(1220, activation="softplus", name="regression")(x_1)

model = Model(inputs=[x_in, a_in], outputs=[output1, output2])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss={
        "classification": tf.keras.losses.BinaryCrossentropy(),
        "regression": tf.keras.losses.MeanSquaredError(),
    },
    loss_weights={"classification": lambda1, "regression": lambda2},
)

model.fit(loader.load(), steps_per_epoch=loader.steps_per_epoch, epochs=100)

Here is the error:

Epoch 1/100
Traceback (most recent call last):
File "", line 1, in
File "/ndata/home/genomeindia/gi-advance/gi6/.conda/envs/dt2_conda/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/tmp/autograph_generated_filevlmbbxzy.py", line 15, in tf__train_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
ValueError: in user code:

File "/ndata/home/genomeindia/gi-advance/gi6/.conda/envs/dt2_conda/lib/python3.9/site-packages/keras/engine/training.py", line 1249, in train_function  *
    return step_function(self, iterator)
File "/ndata/home/genomeindia/gi-advance/gi6/.conda/envs/dt2_conda/lib/python3.9/site-packages/keras/engine/training.py", line 1233, in step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/ndata/home/genomeindia/gi-advance/gi6/.conda/envs/dt2_conda/lib/python3.9/site-packages/keras/engine/training.py", line 1222, in run_step  **
    outputs = model.train_step(data)
File "/ndata/home/genomeindia/gi-advance/gi6/.conda/envs/dt2_conda/lib/python3.9/site-packages/keras/engine/training.py", line 1024, in train_step
    loss = self.compute_loss(x, y, y_pred, sample_weight)
File "/ndata/home/genomeindia/gi-advance/gi6/.conda/envs/dt2_conda/lib/python3.9/site-packages/keras/engine/training.py", line 1082, in compute_loss
    return self.compiled_loss(
File "/ndata/home/genomeindia/gi-advance/gi6/.conda/envs/dt2_conda/lib/python3.9/site-packages/keras/engine/compile_utils.py", line 265, in __call__
    loss_value = loss_obj(y_t, y_p, sample_weight=sw)
File "/ndata/home/genomeindia/gi-advance/gi6/.conda/envs/dt2_conda/lib/python3.9/site-packages/keras/losses.py", line 152, in __call__
    losses = call_fn(y_true, y_pred)
File "/ndata/home/genomeindia/gi-advance/gi6/.conda/envs/dt2_conda/lib/python3.9/site-packages/keras/losses.py", line 284, in call  **
    return ag_fn(y_true, y_pred, **self._fn_kwargs)
File "/ndata/home/genomeindia/gi-advance/gi6/.conda/envs/dt2_conda/lib/python3.9/site-packages/keras/losses.py", line 2176, in binary_crossentropy
    backend.binary_crossentropy(y_true, y_pred, from_logits=from_logits),
File "/ndata/home/genomeindia/gi-advance/gi6/.conda/envs/dt2_conda/lib/python3.9/site-packages/keras/backend.py", line 5680, in binary_crossentropy
    return tf.nn.sigmoid_cross_entropy_with_logits(

ValueError: `logits` and `labels` must have the same shape, received ((24, 1220) vs (24, 2440)).
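
The shapes in this error suggest that the loader passes the full target matrix (2440 columns, classification and regression stacked side by side) to a head that predicts only 1220 columns. A minimal sketch of one possible workaround is to split the targets per output name before they reach Keras. The helper `split_targets` and the constant `N_CLS` are hypothetical names, and the arrays are placeholders with the shapes from the error message, not the real data:

```python
import numpy as np

N_CLS = 1220  # width of the classification block, taken from the slicing in the post


def split_targets(inputs, y):
    """Split stacked node targets into one entry per named model output."""
    return inputs, {
        "classification": y[:, :N_CLS],
        "regression": y[:, N_CLS:],
    }


# Placeholder arrays with the shapes reported in the error message (24 nodes).
x = np.zeros((24, 514), dtype=np.float32)
a = np.eye(24, dtype=np.float32)
y = np.zeros((24, 2 * N_CLS), dtype=np.float32)

inputs, targets = split_targets((x, a), y)
print({name: t.shape for name, t in targets.items()})

# Assumption: loader.load() returns a tf.data.Dataset yielding ((x, a), y),
# so the same function could be mapped over it before fitting:
#   train_data = loader.load().map(split_targets)
#   model.fit(train_data, steps_per_epoch=loader.steps_per_epoch, epochs=100)
```

Returning a dict keyed by the output layer names keeps the targets aligned with the `loss` and `loss_weights` dicts already passed to `model.compile`. Whether the commented `map` call works unchanged depends on what the installed Spektral version returns from `loader.load()`, so that part is untested here.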

Please let me know whether I can modify the loader class to handle this, or whether there is another way around it. Eventually, I aim to train over multiple graphs.
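
For the multi-graph goal, one common representation is the disjoint union: node features and per-node targets of all graphs are stacked row-wise, and the adjacency matrices are placed on a block diagonal so that no edges connect different graphs. Spektral's `DisjointLoader` with `node_level=True` is, as far as I understand, built around this idea. The sketch below is only a NumPy illustration of the concept, not Spektral's implementation; `disjoint_union` and the toy graphs are hypothetical:

```python
import numpy as np


def disjoint_union(graphs):
    """Merge (x, a, y) triples into one graph with a block-diagonal adjacency."""
    xs, adjs, ys = zip(*graphs)
    n_total = sum(adj.shape[0] for adj in adjs)
    a = np.zeros((n_total, n_total), dtype=np.float32)
    offset = 0
    for adj in adjs:
        n = adj.shape[0]
        a[offset:offset + n, offset:offset + n] = adj  # no edges across graphs
        offset += n
    return np.vstack(xs), a, np.vstack(ys)


# Two toy graphs (3 and 5 nodes) with 4 features and 6 targets per node.
rng = np.random.default_rng(0)
toy_graphs = [
    (rng.normal(size=(n, 4)), np.ones((n, n), dtype=np.float32), rng.normal(size=(n, 6)))
    for n in (3, 5)
]
x_all, a_all, y_all = disjoint_union(toy_graphs)
print(x_all.shape, a_all.shape, y_all.shape)
```

Because the targets stay one row per node, the column split of `y` used in the single-graph `model.fit` call (first 1220 columns for classification, the rest for regression) would still apply to the merged target matrix.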

Devashish13 changed the title from "MultiTask datapreparation for training keras model" to "MultiTask data preparation for training keras model" on Nov 7, 2024