Dynamic gesture recognition method based on Improved R(2+1)D

Yupeng Huo; Jie Shen; Sheng Zhang; Li Wang

doi:10.1109/CVIDL58838.2023.10166120

College of Electrical Engineering and Control Science

Nanjing Tech University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Methods based on 3D convolutional neural networks have made significant progress in dynamic gesture recognition. Dynamic gestures are highly redundant in both the temporal and spatial dimensions, and a complex environment during recognition can easily affect the final results. It is therefore crucial to make the model focus on the important moments and regions of gesture movements and extract the salient spatiotemporal features, which further improves model performance. To address this issue, this paper proposes a lightweight Temporal-Spatial-Channel attention (TSCA) module based on the R(2+1)D network. The module consists of two sub-modules, a Temporal-Channel attention (TCA) module and a Temporal-Spatial attention (TSA) module, which together enable the model to focus on important information along the spatial, channel, and temporal dimensions during gesture movements. Finally, the TSCA module is integrated into the R(2+1)D network, adding only 2.8M parameters, and the resulting model achieves good performance on the IPN-Hand and NvGesture datasets.
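
The abstract states the cost of the attention module as a parameter count. As a rough illustration of how such counts are tallied for a (2+1)D block, the sketch below compares a full 3D convolution with its factorization into a spatial convolution followed by a temporal one. The intermediate width follows the rule from the original R(2+1)D design; the function names and the 64-channel example are assumptions for illustration and are not taken from this paper, whose layer shapes the abstract does not give.

```python
def conv3d_params(c_in, c_out, t, d):
    """Weights in a full t x d x d 3D convolution (bias terms omitted)."""
    return c_in * c_out * t * d * d


def mid_channels(c_in, c_out, t, d):
    """Intermediate width M of a (2+1)D block.

    Chosen so that the factorized block has roughly as many weights
    as the 3D convolution it replaces (rule from the R(2+1)D design).
    """
    return (t * d * d * c_in * c_out) // (d * d * c_in + t * c_out)


def conv2plus1d_params(c_in, c_out, t, d):
    """Weights in the factorized block: 1 x d x d spatial, then t x 1 x 1 temporal."""
    m = mid_channels(c_in, c_out, t, d)
    spatial = c_in * m * d * d   # 2D convolution over each frame
    temporal = m * c_out * t     # 1D convolution across frames
    return spatial + temporal


# Hypothetical layer: 64 input channels, 64 output channels, 3 x 3 x 3 kernel.
full = conv3d_params(64, 64, 3, 3)
factored = conv2plus1d_params(64, 64, 3, 3)
print(full, mid_channels(64, 64, 3, 3), factored)
```

Summing such per-layer counts over a network, with and without an added module, is how a figure like the 2.8M increase would typically be obtained.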

Original language	English
Title of host publication	2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	323-328
Number of pages	6
ISBN (Electronic)	9798350326444
DOI	https://doi.org/10.1109/CVIDL58838.2023.10166120
Publication status	Published - 2023
Event	4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023 - Zhuhai, China. Duration: 12 May 2023 → 14 May 2023

Publication series

Name	2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023

Conference

Conference	4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023
Country/Territory	China
City	Zhuhai
Period	12/05/23 → 14/05/23

Cite this

Huo, Y., Shen, J., Zhang, S., & Wang, L. (2023). Dynamic gesture recognition method based on Improved R(2+1)D. In 2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023 (pp. 323-328). (2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CVIDL58838.2023.10166120

@inproceedings{0b2181de7406436c8b562ec1373f546d,

title = "Dynamic gesture recognition method based on Improved R(2+1)D",

abstract = "Currently, methods based on 3D convolutional neural networks have made significant progress in the field of dynamic gesture recognition. Dynamic gestures are highly redundant in both the temporal and spatial dimensions, and the complex environment during the recognition process can easily affect the final recognition results. Therefore, it is crucial to make the model focus on the important moments and regions of gesture movements and extract relevant salient spatiotemporal features to further improve model performance. To address this issue, this paper proposes a lightweight Temporal-Spatial-Channel attention (TSCA) module based on the R(2+1)D network. The module consists of two sub-modules: a Temporal-Channel attention (TCA) module and a Temporal-Spatial attention (TSA) module, with the goal of enabling the model to focus on important information along the spatial, channel, and temporal dimensions during gesture movements. Finally, the TSCA attention module is integrated into the R(2+1)D network, resulting in only a 2.8M increase in parameters, and achieves good performance on the IPN-Hand and NvGesture datasets.",

keywords = "Gesture Recognition, R(2+1)D Convolution, Self-Attention",

author = "Yupeng Huo and Jie Shen and Sheng Zhang and Li Wang",

note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023 ; Conference date: 12-05-2023 Through 14-05-2023",

year = "2023",

doi = "10.1109/CVIDL58838.2023.10166120",

language = "English",

series = "2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "323--328",

booktitle = "2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023",

address = "United States",

}

Huo, Y, Shen, J, Zhang, S & Wang, L 2023, Dynamic gesture recognition method based on Improved R(2+1)D. in 2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023. 2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023, Institute of Electrical and Electronics Engineers Inc., pp. 323-328, 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023, Zhuhai, China, 12/05/23. https://doi.org/10.1109/CVIDL58838.2023.10166120

Dynamic gesture recognition method based on Improved R(2+1)D. / Huo, Yupeng; Shen, Jie; Zhang, Sheng et al.
2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023. Institute of Electrical and Electronics Engineers Inc., 2023. p. 323-328 (2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023).

TY - GEN

T1 - Dynamic gesture recognition method based on Improved R(2+1)D

AU - Huo, Yupeng

AU - Shen, Jie

AU - Zhang, Sheng

AU - Wang, Li

PY - 2023

Y1 - 2023

AB - Currently, methods based on 3D convolutional neural networks have made significant progress in the field of dynamic gesture recognition. Dynamic gestures are highly redundant in both the temporal and spatial dimensions, and the complex environment during the recognition process can easily affect the final recognition results. Therefore, it is crucial to make the model focus on the important moments and regions of gesture movements and extract relevant salient spatiotemporal features to further improve model performance. To address this issue, this paper proposes a lightweight Temporal-Spatial-Channel attention (TSCA) module based on the R(2+1)D network. The module consists of two sub-modules: a Temporal-Channel attention (TCA) module and a Temporal-Spatial attention (TSA) module, with the goal of enabling the model to focus on important information along the spatial, channel, and temporal dimensions during gesture movements. Finally, the TSCA attention module is integrated into the R(2+1)D network, resulting in only a 2.8M increase in parameters, and achieves good performance on the IPN-Hand and NvGesture datasets.

KW - Gesture Recognition

KW - R(2+1)D Convolution

KW - Self-Attention

UR - http://www.scopus.com/inward/record.url?scp=85166347983&partnerID=8YFLogxK

U2 - 10.1109/CVIDL58838.2023.10166120

DO - 10.1109/CVIDL58838.2023.10166120

M3 - Conference contribution

AN - SCOPUS:85166347983

T3 - 2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023

SP - 323

EP - 328

BT - 2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023

Y2 - 12 May 2023 through 14 May 2023

ER -

Huo Y, Shen J, Zhang S, Wang L. Dynamic gesture recognition method based on Improved R(2+1)D. In 2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023. Institute of Electrical and Electronics Engineers Inc. 2023. p. 323-328. (2023 4th International Conference on Computer Vision, Image and Deep Learning, CVIDL 2023). doi: 10.1109/CVIDL58838.2023.10166120
