I am trying to perform segmentation of coils on PC motherboards.
I have around 150 training samples; some have more or bigger coils and some have fewer, but every image contains at least one coil.
I tried to implement the segmentation with a VGG net that has a fully convolutional network (FCN) on top of it, without the fully connected layers of the VGG. Before that, I tried something similar with a much smaller architecture, with the same outcome.
What I noticed in both cases is that the net very quickly starts to predict that the whole image is background. I suspect this happens because the coils are only a very small part of the whole image, so it is easier for the net to optimize toward an all-black image.
I use BCEWithLogitsLoss as the criterion and RMSprop as the optimizer. Below are my statistics with the VGG net + FCN. In the IoUs array, the first value is for the coils and the second is for the background. (Source code with the vgg+fcn part for my second try.) I used vgg11 and just the plain FCN.
Is it not possible to do segmentation with this dataset? Is the dataset too small? What else could be the reason for these results?
This problem can certainly be solved with convolutional neural networks; in fact, this is what they are pretty good at. A few possible reasons why your network does not work come to mind:
Try using a weighted binary cross-entropy loss. Segmentation training can easily fail when one class occupies a much larger portion of the image than the other. Alternatively, try using IoU or the Dice coefficient as the loss function.
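As a rough illustration of the two options above, here is a minimal PyTorch sketch. The helper names `foreground_pos_weight` and `soft_dice_loss`, the smoothing term `eps`, and the toy 8x8 masks are assumptions made for this example, not part of your code. The weighting uses the `pos_weight` argument of `BCEWithLogitsLoss`, set here to the background-to-foreground pixel ratio, which is one common heuristic rather than the only choice.

```python
import torch
import torch.nn as nn


def foreground_pos_weight(masks: torch.Tensor) -> torch.Tensor:
    """Background-to-foreground pixel ratio, usable as pos_weight (a heuristic)."""
    fg = masks.sum()
    bg = masks.numel() - fg
    # clamp avoids division by zero if a batch happens to contain no foreground
    return (bg / fg.clamp(min=1.0)).reshape(1)


def soft_dice_loss(logits: torch.Tensor, targets: torch.Tensor, eps: float = 1.0) -> torch.Tensor:
    """1 - soft Dice coefficient on sigmoid probabilities, averaged over the batch."""
    probs = torch.sigmoid(logits)
    dims = (1, 2, 3)  # sum over channel, height, width; keep the batch dimension
    intersection = (probs * targets).sum(dims)
    total = probs.sum(dims) + targets.sum(dims)
    dice = (2.0 * intersection + eps) / (total + eps)
    return 1.0 - dice.mean()


# Toy batch (hypothetical data): two 1x8x8 masks with a small foreground patch.
masks = torch.zeros(2, 1, 8, 8)
masks[:, :, 2:4, 2:4] = 1.0

# Logits of a net that predicts "background everywhere".
all_background_logits = torch.full((2, 1, 8, 8), -3.0)

pos_weight = foreground_pos_weight(masks)
plain_bce = nn.BCEWithLogitsLoss()
weighted_bce = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

plain_value = plain_bce(all_background_logits, masks)
weighted_value = weighted_bce(all_background_logits, masks)
dice_value = soft_dice_loss(all_background_logits, masks)
```

On this toy batch the all-background prediction gets a small unweighted BCE loss but a much larger weighted one, so the optimizer has less incentive to collapse to an all-black output. The Dice term can be used on its own or added to the BCE term.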
150 images is a pretty small dataset. Make sure you at least use a pre-trained network, and even then consider getting more training data. Also, make sure you use data augmentation: it is a cheap way to get a lot more data.
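For segmentation, the one thing to watch with augmentation is that every geometric transform must be applied identically to the image and its mask. A minimal sketch with random flips, assuming channel-first tensors; the function name `random_flip_pair` and the toy tensors are made up for this example:

```python
import torch


def random_flip_pair(image: torch.Tensor, mask: torch.Tensor, generator=None):
    """Apply the same random horizontal/vertical flips to an image and its mask.

    Both tensors are assumed to be shaped (C, H, W).
    """
    if torch.rand(1, generator=generator).item() < 0.5:
        # horizontal flip: reverse the width axis of both tensors
        image, mask = torch.flip(image, dims=[-1]), torch.flip(mask, dims=[-1])
    if torch.rand(1, generator=generator).item() < 0.5:
        # vertical flip: reverse the height axis of both tensors
        image, mask = torch.flip(image, dims=[-2]), torch.flip(mask, dims=[-2])
    return image, mask


# Toy example (hypothetical data): the mask is derived from the image,
# so it is easy to check that the pair stays aligned after augmentation.
g = torch.Generator().manual_seed(0)
image = torch.arange(3 * 4 * 5, dtype=torch.float32).reshape(3, 4, 5)
mask = (image[:1] > 10).float()

aug_image, aug_mask = random_flip_pair(image, mask, generator=g)
```

Calling this inside the dataset's `__getitem__` gives a different variant of each sample every epoch. Rotations, crops, and brightness changes can be added in the same way, with only the geometric ones applied to the mask.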