[en] By reducing the waiting time caused by the scheduling process and exploiting multi-access transmission, grant-free non-orthogonal multiple access (GF-NOMA) is considered a promising access technology for URLLC-enabled 5G systems with strict reliability and latency requirements. However, GF-NOMA-based systems can suffer from severe interference caused by grant-free (GF) access, which may degrade system performance and violate URLLC requirements. To overcome this issue, this paper proposes a novel reinforcement-learning (RL)-based random access (RA) protocol in which each device learns from its previous decisions and their resulting performance to select the best subchannels and transmit power level for data transmission, thereby avoiding strong cross-interference. The learning-based framework is developed to maximize the system access efficiency, defined as the ratio of the number of successful transmissions to the number of subchannels. Simulation results show that the proposed framework significantly improves the system access efficiency in overloaded scenarios.
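As an illustration of the two ideas above, the following minimal Python sketch computes the access efficiency as defined in the abstract and shows one possible form of per-device learning. It is not the paper's protocol: the class `DeviceLearner`, the stateless epsilon-greedy Q-learning rule, the parameters `epsilon` and `step`, and the restriction to a single subchannel per action are all assumptions made for illustration. The reward (for example, 1 for a successful transmission and 0 otherwise) is supplied by the caller, since the abstract does not specify the decoding model.

```python
import random
from itertools import product


def access_efficiency(num_successful, num_subchannels):
    """Ratio of successful transmissions to subchannels (definition from the abstract)."""
    if num_subchannels <= 0:
        raise ValueError("num_subchannels must be positive")
    return num_successful / num_subchannels


class DeviceLearner:
    """Hypothetical per-device learner: stateless Q-learning with epsilon-greedy choice.

    An action is a (subchannel index, power level index) pair. This is an
    assumed simplification, not the learning rule used in the paper.
    """

    def __init__(self, num_subchannels, num_power_levels,
                 epsilon=0.1, step=0.5, seed=None):
        self.actions = list(product(range(num_subchannels),
                                    range(num_power_levels)))
        self.q = {action: 0.0 for action in self.actions}
        self.epsilon = epsilon  # exploration probability (assumed)
        self.step = step        # learning rate (assumed)
        self.rng = random.Random(seed)

    def select(self):
        """Explore with probability epsilon, otherwise pick the best-valued action."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=self.q.get)

    def update(self, action, reward):
        """Move the value estimate of the chosen action toward the observed reward."""
        self.q[action] += self.step * (reward - self.q[action])
```

In a simulation loop, each device would call `select()`, transmit, and then call `update()` with the observed outcome; `access_efficiency` would be evaluated per access slot. Under this definition the value may exceed 1 in overloaded scenarios if several devices are successfully decoded on the same subchannel.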