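I have now run into this error twice. The first time, the cause was a linear layer in the network: I had miscalculated its dimensions without noticing. When the error showed up again, I kept the linear layers firmly in mind, stared at the network for a long time, and recalculated again and again, but the dimensions were correct! The same network even trained fine on other data. Then I noticed that the print statement for the first convolution layer never produced any output, so I went to the first Block and found that the edge feature dimension set there was 3, whereas after switching datasets the edge features I was actually passing in were 2-dimensional. Changing that fixed it. The lesson: read the code line by line, and do not assume the error must be in one particular place and then conclude that nothing is wrong.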
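A mismatch like this can be caught before it ever reaches cuBLAS by adding an explicit shape check at the top of `forward`. The following is only a minimal sketch, not the code from this project: the helper name `check_feature_dim` and the example sizes are hypothetical, and it assumes the layer was built for 3 edge features while the new data supplies 2.

```python
def check_feature_dim(name, shape, expected_dim):
    """Raise a readable error when the per-item feature width is not expected_dim.

    `shape` is any sequence of ints, e.g. `tuple(edge_attr.shape)`.
    A 1-D shape [N] is treated as one feature per item (an assumption
    made for this sketch).
    """
    shape = tuple(shape)
    if len(shape) not in (1, 2):
        raise ValueError(f"{name}: expected shape [N] or [N, F], got {shape}")
    width = 1 if len(shape) == 1 else shape[1]
    if width != expected_dim:
        raise ValueError(
            f"{name}: layer expects {expected_dim} features per item, "
            f"but the data has {width} (shape {shape})"
        )
    return width


# Hypothetical sizes: 8 edges with 2 features each, layer configured for 3.
try:
    check_feature_dim("edge_attr", (8, 2), expected_dim=3)
    message = "ok"
except ValueError as err:
    message = str(err)
print(message)
```

In a model, a call such as `check_feature_dim("edge_attr", tuple(edge_attr.shape), 3)` at the start of the first block's `forward` would make the failure point at the data rather than at a cuBLAS call.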

The original error message was as follows:

********** start build dataloader **********
train data number:50000,eval data number:10000
********** start training
 ** On entry to SGEMM  parameter number 10 had an illegal value
 ** On entry to SGEMM  parameter number 10 had an illegal value
 ** On entry to SGEMM  parameter number 10 had an illegal value
 ** On entry to SGEMM  parameter number 10 had an illegal value
Traceback (most recent call last):
  File "/home/ubuntu/lxd-workplace/ldfde/new_new/Graph/cifar10/bas1/train_me.py", line 447, in <module>
    train(args)
  File "/home/ubuntu/lxd-workplace/ldfde/new_new/Graph/cifar10/bas1/train_me.py", line 334, in train
    outputs=model(sample_batched)
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch_geometric/nn/data_parallel.py", line 70, in forward
    outputs = self.parallel_apply(replicas, inputs, None)
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch/_utils.py", line 425, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/lxd-workplace/ldfde/new_new/Graph/cifar10/bas1/model_eq.py", line 62, in forward
    x = self.block1(x,adj,edge_attr)
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/lxd-workplace/ldfde/new_new/Graph/cifar10/bas1/model_eq.py", line 28, in forward
    x = self.conv0(x, adj,edge_attr=edge_attr)
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch_geometric/nn/conv/gat_conv.py", line 238, in forward
    alpha = self.edge_updater(edge_index, alpha=alpha, edge_attr=edge_attr)
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch_geometric/nn/conv/message_passing.py", line 501, in edge_updater
    out = self.edge_update(**edge_kwargs)
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch_geometric/nn/conv/gat_conv.py", line 269, in edge_update
    edge_attr = self.lin_edge(edge_attr)
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch_geometric/nn/dense/linear.py", line 136, in forward
    return F.linear(x, self.weight, self.bias)
  File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch/nn/functional.py", line 1847, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
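In a trace this long, most frames belong to library code, and the useful starting point is the innermost frame in your own files. As a rough sketch (the function name `innermost_user_frame` and the `site-packages` heuristic are my own assumptions, not part of PyTorch or any tool), that frame can be picked out of the pasted text like this:

```python
import re

# Matches lines such as:  File "/path/to/file.py", line 28, in forward
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def innermost_user_frame(traceback_text, library_marker="site-packages"):
    """Return (path, line, func) for the last frame outside library code, or None.

    Assumption: any path containing `library_marker` is third-party code.
    """
    last = None
    for match in FRAME_RE.finditer(traceback_text):
        if library_marker not in match.group("path"):
            last = (match.group("path"), int(match.group("line")), match.group("func"))
    return last


# Three frames copied from the traceback in this post.
sample = '''
File "/home/ubuntu/lxd-workplace/ldfde/new_new/Graph/cifar10/bas1/model_eq.py", line 62, in forward
File "/home/ubuntu/lxd-workplace/ldfde/new_new/Graph/cifar10/bas1/model_eq.py", line 28, in forward
File "/home/ubuntu/miniconda3/envs/pt_xt/lib/python3.9/site-packages/torch_geometric/nn/conv/gat_conv.py", line 238, in forward
'''

frame = innermost_user_frame(sample)
print(frame)
```

For the trace in this post, that frame is line 28 of `model_eq.py`, the `self.conv0(x, adj,edge_attr=edge_attr)` call, and the library frames just below it (`edge_update`, `self.lin_edge(edge_attr)`) show that the failing linear layer is the one applied to the edge features.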

Good luck to you, and may all your code run smoothly.
