
When implementing a two-tower model such as DSSM, real negative samples can be hard to obtain. In-batch negative sampling gets around this: the training data only needs positive (user, item) pairs, and each user's negatives are taken from the other items in the same batch.
Let's start with a small example to understand the code.
import tensorflow as tf
import random

batchSize = 4
NEG = 2  # number of in-batch negatives per positive

# L2-normalize the embeddings so the dot products below are cosine similarities.
normalize = tf.keras.layers.Lambda(lambda x: tf.keras.backend.l2_normalize(x, axis=1))

item_y = tf.constant([[2, 3], [5, 6], [7, 1], [3, 4]], dtype=tf.float32)
item_y = normalize(item_y)
item_y_temp = item_y
user_y = tf.constant([[2, 3], [5, 6], [7, 1], [3, 4]], dtype=tf.float32)
user_y = normalize(user_y)

for i in range(NEG):
    # Pick a rotation offset in (0, batchSize); rotating the item rows pairs
    # each user with another user's item, which serves as an in-batch negative.
    rand = int((random.random() + i) * batchSize / NEG)
    print(rand)
    if rand == 0:
        rand += 1
    if rand == 4:
        rand = rand - 1
    item_y = tf.concat([item_y,
                        tf.slice(item_y_temp, [rand, 0], [batchSize - rand, -1]),
                        tf.slice(item_y_temp, [0, 0], [rand, -1])], 0)
    print(item_y)

user_y_test = tf.tile(user_y, [NEG + 1, 1])
print(user_y_test)
prod_raw = tf.reduce_sum(tf.multiply(tf.tile(user_y, [NEG + 1, 1]), item_y), 1, True)
print(prod_raw)
# Reshape into (batchSize, NEG + 1): one row per user, column 0 = positive score.
prod = tf.transpose(tf.reshape(tf.transpose(prod_raw), [NEG + 1, batchSize]))
print(prod)
Running this code prints:
0
tf.Tensor(
[[0.55470014 0.8320502 ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.55470014 0.8320502 ]], shape=(8, 2), dtype=float32)
2
tf.Tensor(
[[0.55470014 0.8320502 ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.55470014 0.8320502 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.55470014 0.8320502 ]
 [0.6401844  0.7682213 ]], shape=(12, 2), dtype=float32)
tf.Tensor(
[[0.55470014 0.8320502 ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.55470014 0.8320502 ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]
 [0.55470014 0.8320502 ]
 [0.6401844  0.7682213 ]
 [0.98994946 0.14142135]
 [0.6        0.8       ]], shape=(12, 2), dtype=float32)
tf.Tensor(
[[0.99999976]
 [1.0000001 ]
 [0.99999994]
 [1.        ]
 [0.99430907]
 [0.7423931 ]
 [0.70710677]
 [0.9984603 ]
 [0.6667947 ]
 [0.99868774]
 [0.6667947 ]
 [0.99868774]], shape=(12, 1), dtype=float32)
tf.Tensor(
[[0.99999976 0.99430907 0.6667947 ]
 [1.0000001  0.7423931  0.99868774]
 [0.99999994 0.70710677 0.6667947 ]
 [1.         0.9984603  0.99868774]], shape=(4, 3), dtype=float32)
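The final (4, 3) matrix holds, per user, the similarity with its positive item (column 0, all close to 1.0) and with two in-batch negatives. The post does not show how the softmax and loss are wired into the real model, so the following is only a minimal sketch of that step, using the similarity matrix from the run above:

```python
import tensorflow as tf

batch_size = 4
NEG = 2
# The (batch_size, NEG + 1) similarity matrix printed above:
# column 0 is each user's positive item, the other columns are in-batch negatives.
prod = tf.constant([[0.99999976, 0.99430907, 0.6667947],
                    [1.0000001,  0.7423931,  0.99868774],
                    [0.99999994, 0.70710677, 0.6667947],
                    [1.,         0.9984603,  0.99868774]])
probs = tf.nn.softmax(prod, axis=1)
# The positive always sits in column 0, so the target is one-hot at index 0.
labels = tf.one_hot(tf.zeros(batch_size, dtype=tf.int32), depth=NEG + 1)
loss = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(labels, probs))
print(loss.numpy())
```

Maximizing column 0's softmax probability pushes each user's embedding toward its own item and away from the other items in the batch, which is the whole point of the rotation trick.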
Now we drop this code into our model, apply a softmax over the (NEG + 1) scores per user, and switch the loss to categorical cross-entropy. Training starts fine, but just as an epoch is about to finish it crashes with:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected size[0] in [0, 252], but got 906
 [[node dssm/Slice (defined at /data1/starfm/starfmRec/model/dssm/dssm.py:70) ]] [Op:__inference_train_function_55102]
Errors may have originated from an input operation.
Input Source operations connected to node dssm/Slice:
 dssm/lambda_1/l2_normalize (defined at /data1/starfm/starfmRec/model/dssm/dssm.py:42)
The cause: the input pipeline feeds the model one batch_size of data at a time, and at the end of an epoch there is usually a leftover batch smaller than batch_size. That smaller batch breaks every shape computation written against the fixed batch_size (the tf.slice sizes above no longer fit). My fix is to drop the incomplete final batch:
dataset = dataset.batch(batch_size, drop_remainder=True)
Passing drop_remainder=True to this call discards the trailing partial batch, and the model then runs successfully.
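To see what the fix does, here is a self-contained toy check (the range(10) dataset is illustrative, not the original pipeline): with 10 examples and a batch size of 4, drop_remainder=True discards the trailing batch of 2.

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(10)

kept = list(dataset.batch(4, drop_remainder=True).as_numpy_iterator())
full = list(dataset.batch(4).as_numpy_iterator())

print([b.shape for b in kept])  # only full (4,) batches remain
print([b.shape for b in full])  # the last batch has shape (2,)
```

The trade-off is that up to batch_size - 1 examples per epoch are never seen, which is usually negligible; shuffling each epoch means different examples are dropped each time.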