INDEX
NEGATIVE LOGITS
ANY      0.61
geniş    0.61
atakse   0.59
కుంట     0.57
넓       0.53
MRS      0.53
Garage   0.53
AKT      0.53
창       0.53
Dist     0.52
POSITIVE LOGITS
vieron      0.60
ாலி         0.59
良かった    0.58
Offer       0.57
rion        0.57
銛          0.56
Lew         0.56
දී          0.55
northward   0.55
Hir         0.55
Activations Density 0.000%