Explanations
public perception and opinions
Negative Logits
luscious     0.47
citrus       0.43
тыся         0.43
zealand      0.42
蕤           0.42
willfully    0.42
succulent    0.41
willful      0.39
tanger       0.39
             0.39
Positive Logits
perceptions  0.54
unpopular    0.49
воспринима   0.44
perception   0.43
perceive     0.43
favorables   0.41
público      0.40
pubblico     0.40
survey       0.39
Perception   0.39
Activations Density 0.038%