INDEX
Explanations
guard, vigilance, and protection
Negative Logits
)                 0.86
s                 0.70
ne                0.68
d                 0.66
ir                0.66
na                0.66
in                0.65
.")               0.64
computeEncoder    0.63
u                 0.62
Positive Logits
guard             0.96
guards            0.92
guarding          0.86
guards            0.86
vigilance         0.82
Guards            0.80
Guard             0.79
охра              0.78
vigilancia        0.78
patrol            0.72
Activations Density: 0.022%