
I have loaded my data into a Pandas DataFrame and performed some pre-processing, and now I need to convert it into a PyTorch tensor to use as my feature data for training.

Obviously, this new tensor does NOT need autograd enabled, because it is only source data.

I convert the DataFrame into a tensor as follows:

features = torch.tensor(data=df.iloc[:, 1:cols].values, requires_grad=False)

I dare NOT use torch.from_numpy(), because the resulting tensor will share storage with the source numpy.ndarray, according to the PyTorch docs.
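
For example, as I understand the docs, something like the following minimal sketch shows the difference (illustrative names, not my real code):

import numpy as np
import torch

arr = np.zeros(3, dtype=np.float32)
shared = torch.from_numpy(arr)    # shares memory with arr
copied = torch.tensor(arr)        # copies the data into its own storage

arr[0] = 1.0
print(shared[0])                  # tensor(1.) -- reflects the change to arr
print(copied[0])                  # tensor(0.) -- unaffected, owns its own storage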

Not only is the source ndarray a temporary object, but the original DataFrame will also be released before training, because it is huge.

Furthermore, I'm worried about training performance, so I want my feature data to really be stored in a tensor of its own, not in some kind of 'view' or sharing storage with the ndarray/DataFrame.

I'm confused by the PyTorch docs: they say that from_numpy() shares storage, that torch.Tensor.clone() carries the gradient along, and that if detach() is used, one more copy will occur.
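
Here is a small sketch of the behaviour I am trying to reason about (based on my reading of the docs; the names are only illustrative):

import numpy as np
import torch

src = torch.tensor(np.ones(3, dtype=np.float32))   # torch.tensor() copies; requires_grad defaults to False
print(src.requires_grad)                            # False

tracked = src.clone().requires_grad_(True)          # clone() makes a copy of the data
detached = tracked.detach()                         # detach() drops grad tracking but shares storage
print(detached.requires_grad)                       # False
print(detached.data_ptr() == tracked.data_ptr())    # True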

I just need to create a clean tensor that owns its data, without gradient tracking, and ideally with as few data-copying operations as possible.

Is my method correct, or is there a better way?
