(26 intermediate revisions by 3 users not shown) |
Line 1: |
Line 1: |
− | === SOCIS x264 Profile === | + | {{Lowercase}} |
| + | =ESA SOCIS Main Page= |
| | | |
− | CPU: ARM V7 PMNC, speed 0 MHz (estimated)
| + | == Plans == |
| + | First of all, x264 is profiled (this has been done, and a link to the profiles is below). |
| | | |
− | Counted CPU_CYCLES events (Number of CPU cycles) with a unit mask of 0x00 (No unit mask) count 100000
| + | <br> |
| | | |
− | <pre>samples % image name symbol name
| + | === Profiles === |
− | 9764 17.8387 x264 mc_chroma
| + | [[X264 SOCIS/profiles]] |
− | 3132 5.7221 x264 x264_pixel_avg2_w16_neon
| + | |
− | 2706 4.9438 x264 x264_me_search_ref
| + | <br> |
− | 2697 4.9274 x264 refine_subpel
| + | |
− | 2490 4.5492 x264 x264_quant_4x4_trellis
| + | === Conversions === |
− | 2089 3.8166 x264 x264_pixel_avg2_w8_neon
| + | Functions currently written in C that are intended for asm will have NEON implementations by the end of the project. Every DSP function will also have a NEON implementation. |
− | 2014 3.6795 x264 x264_pixel_satd_8x4
| + | |
− | 1959 3.5791 x264 get_ref_neon
| + | <br> |
− | 1309 2.3915 x264 x264_pixel_sad_16x16_neon
| + | |
− | 1125 2.0554 x264 x264_macroblock_encode
| + | === List of functions to be implemented in NEON === |
− | 1089 1.9896 x264 x264_macroblock_analyse
| + | {{SERVER}}/X264_SOCIS/todo |
− | 807 1.4744 x264 x264_satd_8x4v_8x8h_neon
| + | <br> |
− | 780 1.4250 x264 x264_rd_cost_mb
| + | |
− | 724 1.3227 x264 x264_satd_8x8_neon
| + | == Timeline == |
− | 679 1.2405 x264 x264_pixel_sad_x4_16x16_neon
| + | <u>August 1st - 14th:</u> Learn ARM architecture and ARM assembly |
− | 672 1.2277 x264 x264_pixel_sad_x4_8x8_neon
| + | |
− | 638 1.1656 x264 x264_pixel_satd_4x4_neon
| + | <u>August 15th - 21st:</u> Start learning NEON while writing NEON functions |
− | 634 1.1583 x264 x264_macroblock_cache_load_progressive
| + | |
− | 633 1.1565 x264 x264_pixel_satd_4x4
| + | <u>August 22nd - Onwards:</u> Move on to harder functions<br> |
− | 601 1.0980 x264 x264_pixel_sad_8x8_neon
| + | |
− | 571 1.0432 x264 x264_quant_4x4_neon
| + | <u>Project end date:</u> 28th October 2011 |
− | 528 0.9646 x264 x264_slicetype_mb_cost
| + | |
− | 500 0.9135 x264 x264_mb_predict_mv
| + | <br> |
− | 492 0.8989 x264 x264_pixel_satd_8x8_neon
| + | |
− | 473 0.8642 x264 x264_mb_analyse_intra
| + | ''List will be updated shortly'' |
− | 471 0.8605 x264 x264_mb_encode_8x8_chroma
| + | |
− | 454 0.8295 x264 x264_pixel_sad_x3_16x16_neon
| + | <br> |
− | 452 0.8258 x264 x264_macroblock_tree_propagate
| + | |
− | 404 0.7381 x264 x264_pixel_sad_x3_8x8_neon
| + | == Progress == |
− | 395 0.7217 x264 x264_satd_16x4_neon
| + | My progress to date: |
− | 386 0.7052 libc-2.9.so /lib/libc-2.9.so
| + | |
− | 367 0.6705 x264 x264_mb_predict_mv_ref16x16
| + | *Learned ARM assembly |
− | 353 0.6449 x264 x264_sub8x4_dct_neon
| + | |
− | 309 0.5645 x264 x264_macroblock_cache_save
| + | <br> |
− | 293 0.5353 x264 x264_hadamard_ac_8x8_neon
| + | |
− | 281 0.5134 x264 x264_analyse_update_cache
| + | |
− | 280 0.5116 x264 x264_mb_analyse_inter_b8x8_mixed_ref
| + | |
− | 276 0.5042 x264 x264_slice_write
| + | == Git Access == |
− | 270 0.4933 x264 x264_cabac_encode_decision_c
| + | My git repository can be accessed here: |
− | 267 0.4878 x264 x264_mb_analyse_inter_b16x16
| + | |
− | 253 0.4622 x264 x264_cabac_mb_mvd
| + | <code>https://github.com/samdunne/x264/tree/socis-dev</code> |
− | 235 0.4293 x264 block_residual_write_cabac
| + | |
− | 223 0.4074 x264 x264_decimate_score16
| + | <br> Or if you wish to add my repository as a remote: |
− | 214 0.3910 x264 __aeabi_fdiv
| + | |
− | 209 0.3818 x264 x264_pixel_var2_8x8_neon
| + | <code>git remote add socis git://github.com/samdunne/x264.git</code> |
− | 204 0.3727 x264 __aeabi_fadd
| + | |
− | 201 0.3672 x264 deblock_strength_c
| + | <br> |
− | 197 0.3599 x264 x264_pixel_avg2_w20_neon
| + | |
− | 197 0.3599 x264 x264_pixel_avg_w16_neon
| + | <br> |
− | 187 0.3416 x264 x264_mc_copy_w16_aligned_neon
| + | == ARM Assembly Guide == |
− | 184 0.3362 x264 x264_cabac_mb_type
| + | Something I have been working on so far: |
− | 181 0.3307 x264 load_deinterleave_8x8x2_fenc
| + | http://www.mediafire.com/?bgud9hqqzod6tra |
− | 178 0.3252 x264 x264_macroblock_write_cabac
| + | <br> |
− | 177 0.3234 x264 x264_mb_mc_01xywh
| + | |
− | 168 0.3069 x264 x264_mb_mc_0xywh
| + | |
− | 167 0.3051 x264 mc_luma_neon
| + | == Contact == |
− | 165 0.3015 x264 x264_pixel_ssd_16x16_neon
| + | |
− | 161 0.2941 x264 x264_quant_dc_trellis
| + | === Sam Dunne === |
− | 159 0.2905 x264 x264_pixel_avg_w8_neon
| + | *I can be contacted on IRC [<u>''Nick = SCD101''</u>] |
− | 156 0.2850 x264 x264_mc_copy_w8_neon
| + | *You can also email me at [mailto:sam.dunne101@gmail.com sam.dunne101@gmail.com] |
− | 156 0.2850 x264 x264_pixel_satd_16x16_neon
| + | |
− | 155 0.2832 x264 x264_mb_analyse_inter_p16x16
| + | {{GSoC}} |
− | 151 0.2759 x264 x264_pixel_ssd_8x8_neon
| + | |
− | 150 0.2740 x264 memcpy_aligned_8_16_neon
| + | [[Category:SoC]] |
− | 148 0.2704 x264 __aeabi_fmul
| + | [[Category:x264]] |
− | 143 0.2613 x264 x264_mb_encode_i4x4
| |
− | 134 0.2448 x264 mbtree_propagate_cost
| |
− | 124 0.2265 x264 x264_pixel_sad_8x16_neon
| |
− | 112 0.2046 x264 x264_mc_weight_w16_offsetsub_neon
| |
− | 111 0.2028 x264 x264_pixel_sad_x4_8x16_neon
| |
− | 110 0.2010 x264 x264_mb_predict_mv_direct16x16
| |
− | 108 0.1973 x264 x264_frame_init_lowres_core_neon
| |
− | 98 0.1790 x264 x264_plane_copy_interleave_c
| |
− | 93 0.1699 x264 block_residual_write_cabac
| |
− | 91 0.1663 x264 __floatsisf
| |
− | 88 0.1608 x264 x264_mb_mc
| |
− | 87 0.1589 x264 x264_cabac_mb_ref
| |
− | 87 0.1589 x264 x264_dequant_4x4_neon
| |
− | 86 0.1571 x264 __aeabi_l2f
| |
− | 86 0.1571 x264 x264_satd_4x8_8x4_end_neon
| |
− | 85 0.1553 x264 x264_pixel_var_16x16_neon
| |
− | 82 0.1498 x264 store_interleave_8x8x2
| |
− | 82 0.1498 x264 x264_ratecontrol_mb_qp
| |
− | 80 0.1462 x264 x264_pixel_sad_x4_16x8_neon
| |
− | 79 0.1443 x264 x264_mb_predict_mv_16x16
| |
− | 79 0.1443 x264 x264_pixel_sad_16x8_neon
| |
− | 76 0.1389 x264 x264_mc_weight_w8_offsetsub_neon
| |
− | 72 0.1315 x264 x264_prefetch_fenc_arm
| |
− | 71 0.1297 x264 x264_mb_analyse_b_rd
| |
− | 70 0.1279 x264 x264_add8x4_idct_neon
| |
− | 70 0.1279 x264 x264_frame_deblock_row
| |
− | 70 0.1279 x264 x264_pixel_sad_x3_8x16_neon
| |
− | 68 0.1242 x264 x264_pixel_hadamard_ac_16x16_neon
| |
− | 61 0.1114 x264 x264_coeff_last16_neon
| |
− | 57 0.1041 x264 deblock_v_chroma_c
| |
− | 56 0.1023 x264 x264_predict_16x16_h_c
| |
− | 55 0.1005 x264 x264_predict_4x4_hd_c
| |
− | 54 0.0987 x264 x264_predict_8x8_vr_c
| |
− | 53 0.0968 x264 x264_mb_encode_i16x16
| |
− | 53 0.0968 x264 x264_predict_4x4_vl_c
| |
− | 52 0.0950 x264 x264_hpel_filter_c_neon
| |
− | 52 0.0950 x264 x264_hpel_filter_v_neon
| |
− | 52 0.0950 x264 x264_predict_4x4_vr_c
| |
− | 52 0.0950 x264 x264_predict_8x8_filter_c
| |
− | 50 0.0913 x264 x264_mb_analyse_intra_chroma
| |
− | 50 0.0913 x264 x264_mb_analyse_p_rd
| |
− | 50 0.0913 x264 x264_ratecontrol_mb
| |
− | 49 0.0895 x264 x264_mb_mc_8x8
| |
− | 49 0.0895 x264 x264_predict_8x8_hd_c
| |
− | 47 0.0859 x264 memcpy_aligned_16_16_neon
| |
− | 47 0.0859 x264 x264_mb_mc_1xywh
| |
− | 46 0.0840 x264 x264_pixel_satd_16x8_neon
| |
− | 45 0.0822 x264 x264_cabac_encode_terminal_c
| |
− | 45 0.0822 x264 x264_cabac_mb_mvd
| |
− | 45 0.0822 x264 x264_pixel_sad_x3_16x8_neon
| |
− | 45 0.0822 x264 x264_zigzag_scan_4x4_frame_neon
| |
− | 44 0.0804 x264 deblock_h_chroma_c
| |
− | 44 0.0804 x264 x264_predict_8x8_vl_c
| |
− | 44 0.0804 x264 x264_predict_8x8c_p_neon
| |
− | 43 0.0786 x264 x264_pixel_satd_16x16
| |
− | 43 0.0786 x264 x264_predict_8x8c_dc_c
| |
− | 41 0.0749 x264 x264_macroblock_deblock_strength
| |
− | 40 0.0731 x264 x264_copy_column8
| |
− | 40 0.0731 x264 x264_memcpy_aligned_neon
| |
− | 38 0.0694 x264 x264_predict_8x8_ddl_c
| |
− | 38 0.0694 x264 x264_predict_8x8_ddr_c
| |
− | 37 0.0676 x264 x264_mb_analyse_inter_b8x16
| |
− | 37 0.0676 x264 x264_mc_copy_w16_neon
| |
− | 36 0.0658 x264 x264_deblock_h_luma_neon
| |
− | 36 0.0658 x264 x264_predict_16x16_v_c
| |
− | 36 0.0658 x264 x264_predict_4x4_hu_c
| |
− | 36 0.0658 x264 x264_sub4x4_dct_neon
| |
− | 36 0.0658 x264 x264_sub8x8_dct_dc_neon
| |
− | 35 0.0639 x264 x264_intra_satd_x3_4x4
| |
− | 35 0.0639 x264 x264_mb_analyse_inter_b16x8
| |
− | 35 0.0639 x264 x264_me_refine_bidir_satd
| |
− | 35 0.0639 x264 x264_pixel_satd_4x8_neon
| |
− | 34 0.0621 x264 x264_cabac_mb_type
| |
− | 34 0.0621 x264 x264_predict_16x16_dc_c
| |
− | 33 0.0603 x264 x264_hpel_filter_h_neon
| |
− | 33 0.0603 x264 x264_pixel_satd_8x16_neon
| |
− | 32 0.0585 x264 x264_frame_expand_border_lowres
| |
− | 31 0.0566 x264 x264_predict_4x4_ddr_armv6
| |
− | 29 0.0530 x264 x264_macroblock_probe_skip
| |
− | 29 0.0530 x264 x264_mc_weight_w8_neon
| |
− | 29 0.0530 x264 x264_predict_16x16_p_neon
| |
− | 28 0.0512 x264 memcpy_aligned_8_8_neon
| |
− | 28 0.0512 x264 x264_mb_predict_mv_pskip
| |
− | 28 0.0512 x264 x264_predict_8x8_hu_c
| |
− | 28 0.0512 x264 x264_sub16x16_dct_neon
| |
− | 27 0.0493 x264 x264_me_refine_qpel_refdupe
| |
− | 27 0.0493 x264 x264_pixel_avg_w4_neon
| |
− | 26 0.0475 x264 __fixsfsi
| |
− | 26 0.0475 x264 x264_add4x4_idct_neon
| |
− | 26 0.0475 x264 x264_cabac_mb_ref
| |
− | 26 0.0475 x264 x264_intra_satd_x3_8x8c
| |
− | 24 0.0438 x264 x264_intra_satd_x3_16x16
| |
− | 24 0.0438 x264 x264_pixel_satd_8x4_neon
| |
− | 24 0.0438 x264 x264_quant_2x2_dc_neon
| |
− | 23 0.0420 x264 x264_weight_cost_luma
| |
− | 22 0.0402 x264 x264_predict_8x8c_h_c
| |
− | 21 0.0384 x264 __aeabi_fcmpgt
| |
− | 21 0.0384 x264 x264_predict_4x4_dc_c
| |
− | 20 0.0365 x264 x264_ac_energy_mb
| |
− | 20 0.0365 x264 x264_slicetype_frame_cost
| |
− | 19 0.0347 x264 x264_cabac_encode_bypass_c
| |
− | 19 0.0347 x264 x264_deblock_v_luma_neon
| |
− | 18 0.0329 x264 memcpy_aligned_16_8_neon
| |
− | 17 0.0311 x264 x264_decimate_score15
| |
− | 17 0.0311 x264 x264_intra_rd
| |
− | 16 0.0292 x264 x264_cabac_mb_skip
| |
− | 16 0.0292 x264 x264_var_end
| |
− | 15 0.0274 x264 __cmpsf2
| |
− | 15 0.0274 x264 x264_predict_4x4_h_c
| |
− | 14 0.0256 x264 x264_pixel_avg_8x8_neon
| |
− | 14 0.0256 x264 x264_pixel_avg_weight_w16_add_add_neon
| |
− | 13 0.0238 x264 deblock_v_luma_intra_c
| |
− | 13 0.0238 x264 x264_frame_expand_border
| |
− | 13 0.0238 x264 x264_mc_weight_w8_offsetadd_neon
| |
− | 13 0.0238 x264 x264_predict_4x4_ddl_neon
| |
− | 12 0.0219 x264 x264_frame_expand_border_filtered
| |
− | 12 0.0219 x264 x264_memzero_aligned_neon
| |
− | 12 0.0219 x264 x264_pixel_var_8x8_neon
| |
− | 11 0.0201 x264 x264_mb_cache_mv_b16x8
| |
− | 11 0.0201 x264 x264_predict_4x4_v_c
| |
− | 10 0.0183 x264 x264_cabac_encode_ue_bypass
| |
− | 10 0.0183 x264 x264_macroblock_cache_load_neighbours_deblock
| |
− | 9 0.0164 x264 idct_dequant_2x2_dconly
| |
− | 9 0.0164 x264 x264_mb_analyse_transform_rd
| |
− | 9 0.0164 x264 x264_pixel_avg_weight_w8_add_add_neon
| |
− | 9 0.0164 x264 x264_predict_4x4_dc_armv6
| |
− | 8 0.0146 x264 x264_prefetch_ref_arm
| |
− | 7 0.0128 x264 x264_coeff_last15_neon
| |
− | 7 0.0128 x264 x264_prefetch_fenc
| |
− | 6 0.0110 x264 x264_add8x8_idct_dc_neon
| |
− | 6 0.0110 x264 x264_add8x8_idct_neon
| |
− | 6 0.0110 x264 x264_pixel_avg_16x16_neon
| |
− | 6 0.0110 x264 x264_predict_8x8c_dc_neon
| |
− | 6 0.0110 x264 x264_weight_scale_plane
| |
− | 5 0.0091 x264 __aeabi_cfrcmple
| |
− | 5 0.0091 x264 x264_adaptive_quant_frame
| |
− | 5 0.0091 x264 x264_macroblock_tree_finish
| |
− | 5 0.0091 x264 x264_mb_cache_mv_b8x16
| |
− | 5 0.0091 x264 x264_pixel_avg_4x4_neon
| |
− | 5 0.0091 x264 x264_predict_8x8c_v_c
| |
− | 4 0.0073 x264 __aeabi_ui2f
| |
− | 4 0.0073 x264 deblock_h_chroma_intra_c
| |
− | 4 0.0073 x264 deblock_h_luma_intra_c
| |
− | 4 0.0073 x264 x264_fdec_filter_row
| |
− | 4 0.0073 x264 x264_frame_init_lowres
| |
− | 4 0.0073 x264 x264_predict_16x16_h_neon
| |
− | 4 0.0073 x264 x264_predict_4x4_h_armv6
| |
− | 3 0.0055 x264 __aeabi_cfcmple
| |
− | 3 0.0055 x264 __divdf3
| |
− | 3 0.0055 x264 deblock_v_chroma_intra_c
| |
− | 3 0.0055 x264 x264_dequant_4x4_dc_neon
| |
− | 3 0.0055 x264 x264_encoder_encode
| |
− | 3 0.0055 x264 x264_frame_filter
| |
− | 3 0.0055 x264 x264_nal_escape_c
| |
− | 3 0.0055 x264 x264_pixel_avg_8x16_neon
| |
− | 3 0.0055 x264 x264_predict_16x16_dc_neon
| |
− | 3 0.0055 x264 x264_rc_analyse_slice
| |
− | 2 0.0037 libpthread-2.9.so /lib/libpthread-2.9.so
| |
− | 2 0.0037 x264 __subsf3
| |
− | 2 0.0037 x264 x264_analyse_init_costs
| |
− | 2 0.0037 x264 x264_coeff_last4_arm
| |
− | 2 0.0037 x264 x264_encoder_frame_end
| |
− | 2 0.0037 x264 x264_predict_16x16_dc_top_neon
| |
− | 2 0.0037 x264 x264_quant_4x4_dc_neon
| |
− | 2 0.0037 x264 x264_sub8x8_dct_neon
| |
− | 2 0.0037 x264 x264_weight_cost_init_luma
| |
− | 1 0.0018 libm-2.9.so /lib/libm-2.9.so
| |
− | 1 0.0018 x264 __aeabi_d2f
| |
− | 1 0.0018 x264 __aeabi_f2d
| |
− | 1 0.0018 x264 __aeabi_fcmplt
| |
− | 1 0.0018 x264 __aeabi_uidivmod
| |
− | 1 0.0018 x264 __cmpdf2
| |
− | 1 0.0018 x264 __divdi3
| |
− | 1 0.0018 x264 __muldf3
| |
− | 1 0.0018 x264 __udivdi3
| |
− | 1 0.0018 x264 bs_write_ue_big
| |
− | 1 0.0018 x264 hpel_filter_neon
| |
− | 1 0.0018 x264 optimize_chroma_dc
| |
− | 1 0.0018 x264 x264_add16x16_idct_dc_neon
| |
− | 1 0.0018 x264 x264_dct4x4dc_neon
| |
− | 1 0.0018 x264 x264_frame_copy_picture
| |
− | 1 0.0018 x264 x264_frame_push_unused
| |
− | 1 0.0018 x264 x264_free
| |
− | 1 0.0018 x264 x264_macroblock_cache_mv_4_2
| |
− | 1 0.0018 x264 x264_macroblock_slice_init
| |
− | 1 0.0018 x264 x264_pixel_avg_4x8_neon
| |
− | 1 0.0018 x264 x264_pixel_avg_8x4_neon
| |
− | 1 0.0018 x264 x264_predict_16x16_v_neon</pre>
| |