
When implementing a two-tower model such as DSSM, real negative samples can be hard to obtain. In-batch negative sampling gets around this: the training data only needs positive (user, item) pairs, and each user's negatives are taken from the other items in the same batch.
Let's start with a small example to understand the code.
import tensorflow as tf
import random

batchSize = 4
NEG = 2  # number of in-batch negatives per positive

# L2-normalize the embeddings so the dot products below are cosine similarities.
normalize = tf.keras.layers.Lambda(lambda x: tf.keras.backend.l2_normalize(x, axis=1))

item_y = tf.constant([[2, 3], [5, 6], [7, 1], [3, 4]], dtype=tf.float32)
item_y = normalize(item_y)
item_y_temp = item_y
user_y = tf.constant([[2, 3], [5, 6], [7, 1], [3, 4]], dtype=tf.float32)
user_y = normalize(user_y)

for i in range(NEG):
    # Pick a rotation offset in (0, batchSize); rotating the item rows pairs
    # each user with another user's item, which serves as an in-batch negative.
    rand = int((random.random() + i) * batchSize / NEG)
    print(rand)
    if rand == 0:
        rand += 1
    if rand == 4:
        rand = rand - 1
    item_y = tf.concat([item_y,
                        tf.slice(item_y_temp, [rand, 0], [batchSize - rand, -1]),
                        tf.slice(item_y_temp, [0, 0], [rand, -1])], 0)
    print(item_y)

user_y_test = tf.tile(user_y, [NEG + 1, 1])
print(user_y_test)
prod_raw = tf.reduce_sum(tf.multiply(tf.tile(user_y, [NEG + 1, 1]), item_y), 1, True)
print(prod_raw)
# Reshape into (batchSize, NEG + 1): one row per user, column 0 = positive score.
prod = tf.transpose(tf.reshape(tf.transpose(prod_raw), [NEG + 1, batchSize]))
print(prod)
Running this code prints:
0
tf.Tensor(
[[0.55470014 0.8320502 ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.55470014 0.8320502 ]], shape=(8, 2), dtype=float32)
2
tf.Tensor(
[[0.55470014 0.8320502 ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.55470014 0.8320502 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.55470014 0.8320502 ]
 [0.6401844  0.7682213 ]], shape=(12, 2), dtype=float32)
tf.Tensor(
[[0.55470014 0.8320502 ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.55470014 0.8320502 ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.55470014 0.8320502 ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]], shape=(12, 2), dtype=float32)
tf.Tensor(
[[0.99999976]
 [1.0000001 ]
 [0.99999994]
 [1.        ]
 [0.99430907]
 [0.7423931 ]
 [0.70710677]
 [0.9984603 ]
 [0.6667947 ]
 [0.99868774]
 [0.6667947 ]
 [0.99868774]], shape=(12, 1), dtype=float32)
tf.Tensor(
[[0.99999976 0.99430907 0.6667947 ]
 [1.0000001  0.7423931  0.99868774]
 [0.99999994 0.70710677 0.6667947 ]
 [1.         0.9984603  0.99868774]], shape=(4, 3), dtype=float32)
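The final (4, 3) matrix holds, per user, the similarity with its positive item (column 0, all close to 1.0) and with two in-batch negatives. The post does not show how the softmax and loss are wired into the real model, so the following is only a minimal sketch of that step, using the similarity matrix from the run above:

```python
import tensorflow as tf

batch_size = 4
NEG = 2
# The (batch_size, NEG + 1) similarity matrix printed above:
# column 0 is each user's positive item, the other columns are in-batch negatives.
prod = tf.constant([[0.99999976, 0.99430907, 0.6667947],
                    [1.0000001,  0.7423931,  0.99868774],
                    [0.99999994, 0.70710677, 0.6667947],
                    [1.,         0.9984603,  0.99868774]])
probs = tf.nn.softmax(prod, axis=1)
# The positive always sits in column 0, so the target is one-hot at index 0.
labels = tf.one_hot(tf.zeros(batch_size, dtype=tf.int32), depth=NEG + 1)
loss = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(labels, probs))
print(loss.numpy())
```

Maximizing column 0's softmax probability pushes each user's embedding toward its own item and away from the other items in the batch, which is the whole point of the rotation trick.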
Now we drop this code into our model, apply a softmax over the (NEG + 1) scores per user, and switch the loss to categorical cross-entropy. Training starts fine, but just as an epoch is about to finish it crashes with:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected size[0] in [0, 252], but got 906
 [[node dssm/Slice (defined at /data1/starfm/starfmRec/model/dssm/dssm.py:70) ]] [Op:__inference_train_function_55102]
Errors may have originated from an input operation.
Input Source operations connected to node dssm/Slice:
 dssm/lambda_1/l2_normalize (defined at /data1/starfm/starfmRec/model/dssm/dssm.py:42)
The cause: the input pipeline feeds the model one batch_size of data at a time, and at the end of an epoch there is usually a leftover batch smaller than batch_size. That smaller batch breaks every shape computation written against the fixed batch_size (the tf.slice sizes above no longer fit). My fix is to drop the incomplete final batch:
dataset = dataset.batch(batch_size, drop_remainder=True)
Passing drop_remainder=True to this call discards the trailing partial batch, and the model then runs successfully.
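To see what the fix does, here is a self-contained toy check (the range(10) dataset is illustrative, not the original pipeline): with 10 examples and a batch size of 4, drop_remainder=True discards the trailing batch of 2.

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(10)

kept = list(dataset.batch(4, drop_remainder=True).as_numpy_iterator())
full = list(dataset.batch(4).as_numpy_iterator())

print([b.shape for b in kept])  # only full (4,) batches remain
print([b.shape for b in full])  # the last batch has shape (2,)
```

The trade-off is that up to batch_size - 1 examples per epoch are never seen, which is usually negligible; shuffling each epoch means different examples are dropped each time.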