In the past few years, the popularity of the autonomous sensory meridian response (ASMR) has risen sharply, with video being the most common source of its stimuli. These videos are made by content creators who generate trigger sounds using a variety of materials and techniques. In this study, we propose variations of the WaveRNN and WaveNet models designed to generate these trigger sounds from scratch. We observe that the baseline architectures of these models outperform the alternative conditioned models in terms of the quality of the generated audio.