      Tips to increase PostgreSQL performance

• Posted by bushra.rayafeel
• Category: Blog
• Date: May 28, 2022

1. Avoid joins where possible; use a subquery instead of a join

Note: subqueries can be slower than LEFT [OUTER] JOINs, but in my opinion their strength is slightly higher readability.

Prefer this:
select GPA from Student where sID in (select sID from Apply where major = 'CS');

Don't write it like the following:
select GPA from Student, Apply where Student.sID = Apply.sID and Apply.major = 'CS';

Another example:

      SELECT id, alias
      FROM
        mos_categories
      WHERE
        id IN (
          SELECT
            DISTINCT catid
          FROM mos_content
        );
      
      ~ 5 seconds
      
      SELECT
        DISTINCT mos_content.catid,
        mos_categories.alias
      FROM
        mos_content, mos_categories
      WHERE
        mos_content.catid = mos_categories.id;
      
      ~ 15 seconds

2. If you need to join tables, join from the small table to the big table. Before that, check the table sizes using the following query.

      SELECT pg_size_pretty(pg_relation_size('schema_name.tablename'));
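
For example, with two hypothetical tables (the names are placeholders):

SELECT pg_size_pretty(pg_relation_size('public.orders'));     -- e.g. 12 MB   (small: join from this side)
SELECT pg_size_pretty(pg_relation_size('public.line_items')); -- e.g. 4096 MB (big)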

3. Put PostgreSQL multicolumn index columns in the right order (filter columns first, then join columns, then selected columns):

      Let’s review the explain analyze plan of the following simple query without indexes:

EXPLAIN ANALYZE SELECT line_items.product_id, SUM(line_items.price)
FROM line_items
WHERE product_id > 80
GROUP BY 1;

EXPLAIN PLAN:
      HashAggregate (cost=13.81..14.52 rows=71 width=12) (actual time=0.137..0.141 rows=20 loops=1)
      Group Key: product_id
      -> Seq Scan on line_items (cost=0.00..13.25 rows=112 width=8) (actual time=0.017..0.082 rows=112 loops=1)
      Filter: (product_id > 80)
      Rows Removed by Filter: 388
      Planning time: 0.082 ms
      Execution time: 0.187 ms

AFTER CREATING THE INDEX:

CREATE INDEX items_product_id_price ON line_items(product_id, price);

EXPLAIN PLAN:

      GroupAggregate (cost=0.27..7.50 rows=71 width=12) (actual time=0.034..0.090 rows=20 loops=1)
      Group Key: product_id
      -> Index Only Scan using items_product_id_price on line_items (cost=0.27..6.23 rows=112 width=8) (actual time=0.024..0.049 rows=112 loops=1)
      Index Cond: (product_id > 80)
      Heap Fetches: 0
      Planning time: 0.271 ms
      Execution time: 0.136 ms

CREATING AN INDEX WITH REVERSED ORDER:

      CREATE INDEX items_product_id_price_reversed ON line_items(price, product_id);

It is equivalent to a full-table sequential scan, so don't create an index like this.

ANOTHER EXAMPLE:

SELECT orders.product_id, SUM(line_items.price)
FROM line_items
LEFT JOIN orders ON line_items.order_id = orders.id
WHERE line_items.created_at BETWEEN '2018-01-01' and '2018-01-02'
GROUP BY 1;

Here we have a join on order_id and a filter on created_at. Accordingly, we can create a multicolumn index with created_at in first place, order_id in second, and price in third:

      CREATE INDEX line_items_created_at_order_id_price ON line_items(created_at, order_id, price);

      We’ll get the following explain plan:

      GroupAggregate (cost=12.62..12.64 rows=1 width=12) (actual time=0.029..0.029 rows=1 loops=1)
      Group Key: orders.product_id
      -> Sort (cost=12.62..12.62 rows=1 width=8) (actual time=0.025..0.026 rows=1 loops=1)
      Sort Key: orders.product_id
      Sort Method: quicksort Memory: 25kB
      -> Nested Loop Left Join (cost=0.56..12.61 rows=1 width=8) (actual time=0.015..0.017 rows=1 loops=1)
      -> Index Only Scan using line_items_created_at_order_id_price on line_items (cost=0.27..4.29 rows=1 width=8) (actual time=0.009..0.010 rows=1 loops=1)
      Index Cond: ((created_at >= '2018-01-01 00:00:00'::timestamp without time zone) AND (created_at <= '2018-01-02 00:00:00'::timestamp without time zone))
      Heap Fetches: 0
      -> Index Scan using orders_pkey on orders (cost=0.29..8.30 rows=1 width=8) (actual time=0.004..0.005 rows=1 loops=1)
      Index Cond: (line_items.order_id = id)
      Planning time: 0.303 ms
      Execution time: 0.072 ms

As you can see, line_items_created_at_order_id_price is used to narrow the scan with the date condition. After that, the result is joined with orders using the orders_pkey index scan.

Note: GIN indexes are faster for SELECTs, GiST indexes are faster for UPDATEs, and B-tree is the default index type.
First create the default index and note the execution time, then try GIN and GiST indexes and
note their execution times; keep only the index that gives the best performance.
For taking an execution plan, please refer to https://www.tutorialdba.com/p/postgresql-explain-plan.html
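
As a minimal sketch of that comparison, assume a hypothetical documents(body text) table and a LIKE filter; the pg_trgm extension provides the GIN and GiST operator classes used here:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX docs_body_gin ON documents USING gin (body gin_trgm_ops);
EXPLAIN ANALYZE SELECT * FROM documents WHERE body LIKE '%postgres%';   -- note the execution time
DROP INDEX docs_body_gin;

CREATE INDEX docs_body_gist ON documents USING gist (body gist_trgm_ops);
EXPLAIN ANALYZE SELECT * FROM documents WHERE body LIKE '%postgres%';   -- compare, and keep the winner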

4. In the WHERE clause, use an IN list instead of chained OR conditions

With OR:
      SELECT id FROM a
      WHERE id = 42
         OR id = 4711;
      
      WITHOUT OR:
      SELECT id FROM a
      WHERE id IN (42, 4711);

5. INSTEAD OF MULTIPLE LIKE CONDITIONS, USE "LIKE ANY"

      SELECT id FROM tutorialdba WHERE a_val LIKE 'something%' OR a_val LIKE 'other%';
      
--REWRITTEN AS FOLLOWS:
      SELECT id FROM tutorialdba WHERE a_val LIKE ANY (ARRAY['something%', 'other%']);

6. DROP UNUSED INDEXES:

Index all the things, right? Did you know having unused indexes can negatively affect write performance? The reason is that when you create an index, Postgres is burdened with the task of keeping it updated after write (INSERT / UPDATE / DELETE) operations. So adding an index is a balancing act: it can speed up reading data (if created properly) but will slow down write operations. To find unused indexes you can run the following command.

      SELECT
        indexrelid::regclass as index,
        relid::regclass as table,
        'DROP INDEX ' || indexrelid::regclass || ';' as drop_statement
      FROM
        pg_stat_user_indexes
        JOIN
          pg_index USING (indexrelid)
      WHERE
        idx_scan = 0
        AND indisunique is false;
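
Note that these counters accumulate from the last statistics reset, so make sure they cover a representative workload before dropping anything. A minimal sketch of that check:

-- when were the statistics last reset for the current database?
SELECT stats_reset FROM pg_stat_database WHERE datname = current_database();

-- optionally clear the counters and re-check after a representative period
SELECT pg_stat_reset();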

7. CHECKING INDEX STATUS:

--to check a table's index:
      postgres=# select relname,indexrelname,indexrelid from pg_stat_all_indexes where indexrelid=24700;
       relname | indexrelname | indexrelid 
      ---------+--------------+------------
       k       | production   |      24700 
--check whether the index is valid and whether it is alive
      postgres=# select indexrelid,indislive,indisvalid from pg_index where indexrelid=24700;
       indexrelid | indislive | indisvalid 
      ------------+-----------+------------
            24700 | t         | t
      (1 row)         

MANUALLY REINDEXING IF THE VALUE IS FALSE:

      DROP INDEX registration.rg_patient_ix10;
      CREATE INDEX rg_patient_ix10 ON registration.patient USING btree( lower(firstname), lower(middlename), lower(lastname) varchar_pattern_ops);
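
Alternatively, an invalid index can be rebuilt in place with REINDEX rather than dropped and recreated; on PostgreSQL 12 and later this can even be done without blocking writes:

REINDEX INDEX registration.rg_patient_ix10;
-- PostgreSQL 12+: rebuild without taking a write-blocking lock
REINDEX INDEX CONCURRENTLY registration.rg_patient_ix10;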

8. Before and after changing a query, take the explain plan and note down the query run time

      Example:
      
      Before changing Query functionality:
      EXPLAIN ANALYZE Query; note the Execution time
      
      After changing Query functionality:
      EXPLAIN ANALYZE Query; note the Execution time
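
A minimal sketch, assuming a hypothetical orders table and a new index as the "change":

-- before the change: note the "Execution time" line in the plan output
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

-- the change: add an index on the filter column
CREATE INDEX orders_customer_id_idx ON orders (customer_id);

-- after the change: compare the new "Execution time" against the old one
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;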

9. CHECKING FOR DEAD TUPLES:

If you get a ticket regarding query tuning, check whether the table has any dead tuples; if dead tuples exist, you have to run VACUUM, as in the sketch below.

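A minimal sketch of the check, with schema_name.tablename as a placeholder:

-- count dead tuples for the table in question
SELECT schemaname, relname, n_dead_tup
  FROM pg_stat_user_tables
 WHERE relname = 'tablename';

-- if the count is high, vacuum and re-analyze the table
VACUUM ANALYZE schema_name.tablename;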

10. CHECKING LONG-RUNNING QUERIES:

THIS SCRIPT IS HELPFUL FOR CHECKING LONG-RUNNING QUERIES AS WELL AS DISK USAGE, RAM USAGE, DEAD TUPLES, ETC.

      [root@tutorialdba.com]# cat TUTORIALDBAmonitor.sh
      echo " "
      echo " "
      HOSTNAME="101.0.1.123"
      PSQL="/opt/edb/as9.6/bin/psql"
      PORT=5444
      DB="edb"
      USER="enterprisedb"
      export PGPASSWORD="rayafeel.com" 
      
      echo "ENTER THE TIME IN MINUTES ?"
      echo "For example if you give 10 means it will display "
      echo "queries which is running more than 10 minutes"
      echo "0 means will display all queries:"
      echo ""
      read time
      echo " "
      echo " "
      
      
      echo "WHAT ARE THE QUERY IS RUNING MORE THAN $time MINUTES"
      $PSQL -d $DB -U $USER -p $PORT -h $HOSTNAME <<EOF
      \pset format wrapped
      SELECT pid, now() - query_start as "runtime", usename, datname, state, query
        FROM  pg_stat_activity
        WHERE now() - query_start > '$time minutes'::interval
       ORDER BY runtime DESC;
      
      EOF
      echo "*********************COMPLETED 1****************************** "
      echo " "
      echo " "
      echo " "
      
      echo "CHECKING dISK SPACE:"
              df -h
      
      echo "*********************COMPLETED 2****************************** "
      echo " "
      echo " "
      echo " "
      
      
      
      echo "CHECKING RAM USAGE:"
             free -h
      
      echo "*********************COMPLETED 3****************************** "
      echo " "
      echo " "
      echo " "
      
      
      
      echo "ENTER THE BLOATED COUNT ? "
      echo "For example if you give 1000  means it will display the table count"
      echo "how many table having more than 1000 dead tubles that database:"
      echo ""
      read count
      echo " "
      echo " "
      
      
      
      echo "HOW MANY TABLES HAVING MORE THAN $count BLOATED TABLE :"
      
      $PSQL -d $DB -U $USER -p $PORT -h $HOSTNAME  <<EOF
      \c LC
      select count(*) from pg_stat_all_tables where n_dead_tup > $count;
      EOF
      
      echo " "
      
      $PSQL -d $DB -U $USER -p $PORT -h $HOSTNAME  <<EOF
      \c DC
      select count(*) from pg_stat_all_tables where n_dead_tup >$count;
      EOF
      
      echo "*********************COMPLETED 4****************************** "
      echo " "
      echo " "
      echo " "
      
      
      echo "ENTER THE BLOATED COUNT AGAING ? "
      echo "it will give schema,table name and dead tuble count including "
      echo "with vacuum & analyze script you can copy and run it to the   "
      echo "SQL prompt what are the table having more Bloated table "
      echo ""
      read count1
      echo ""
      echo ""
      
      echo "WHAT ARE THE TABLE HAVING MORE THAN $count1 BLOATED TABLE : "
      
      $PSQL -d $DB -U $USER -p $PORT -h $HOSTNAME  <<EOF
      \c LC
      select 'vacuum ANALYZE '||schemaname||'.'||relname||';' from pg_stat_all_tables where n_dead_tup >$count1;
      select schemaname,relname,n_dead_tup from pg_stat_all_tables where n_dead_tup >$count1;
      EOF
      
      echo " "
      
      $PSQL -d $DB -U $USER -p $PORT -h $HOSTNAME  <<EOF
      \c DC
      select 'vacuum ANALYZE '||schemaname||'.'||relname||';' from pg_stat_all_tables where n_dead_tup >$count1;
      select schemaname,relname,n_dead_tup from pg_stat_all_tables where n_dead_tup >$count1;
      EOF
      
      echo "*********************COMPLETED 5****************************** "
      echo " "
      

      SAMPLE OUTPUT:

      [root@tutorialdba.com]#  sh TUTORIALDBAmonitor.sh
      
ENTER THE TIME IN MINUTES:
For example, if you enter 10 it will display
queries that have been running for more than 10 minutes;
0 will display all queries:
      
      4
      
      
QUERIES RUNNING FOR MORE THAN 4 MINUTES:
      Output format is wrapped.
        pid  |     runtime     | usename  |    datname     | state |   query
      -------+-----------------+----------+----------------+-------+------------
       15832 | 00:05:57.796746 | hr       | LC             | idle  | unlisten *
 15465 | 00:05:35.23556  | ehis     | LC             | idle  | unlisten *
 15113 | 00:05:34.880537 | billing  | LC             | idle  | unlisten *
 15918 | 00:04:50.816381 | security | LC             | idle  | unlisten *
 15930 | 00:04:48.148488 | billing  | DC             | idle  | unlisten *
      ...
      ..
      (10 rows)
      
      *********************COMPLETED 1******************************
      
      
      
CHECKING DISK SPACE:
      Filesystem      Size  Used Avail Use% Mounted on
      /dev/xvda2      250G  2.0G  249G   1% /
      devtmpfs         16G     0   16G   0% /dev
      tmpfs            16G  8.0K   16G   1% /dev/shm
      tmpfs            16G  177M   16G   2% /run
      tmpfs            16G     0   16G   0% /sys/fs/cgroup
      /dev/xvdb       200G   33G  168G  17% /opt
      tmpfs           3.2G     0  3.2G   0% /run/user/1000
      *********************COMPLETED 2******************************
      
      
      
      CHECKING RAM USAGE:
                    total        used        free      shared  buff/cache   available
      Mem:            31G        717M         27G        1.3G        2.7G         28G
      Swap:            0B          0B          0B
      *********************COMPLETED 3******************************
      
      
      
ENTER THE DEAD TUPLE THRESHOLD:
For example, if you enter 1000 it will display how many
tables in the database have more than 1000 dead tuples:
      
      1500
      
      
NUMBER OF TABLES WITH MORE THAN 1500 DEAD TUPLES:
      You are now connected to database "LC" as user "enterprisedb".
       count
      -------
           3
      (1 row)
      
      
      You are now connected to database "DC" as user "enterprisedb".
       count
      -------
           1
      (1 row)
      
      *********************COMPLETED 4******************************
      
      
      
ENTER THE DEAD TUPLE THRESHOLD AGAIN:
This will list the schema, table name, and dead tuple count, along
with a VACUUM ANALYZE script you can copy and run at the
SQL prompt for the tables exceeding the threshold
      
      2000
      
      
TABLES WITH MORE THAN 2000 DEAD TUPLES:
      You are now connected to database "LC" as user "enterprisedb".
                      ?column?
      -----------------------------------------
       vacuum ANALYZE pg_catalog.pg_attribute;
       vacuum ANALYZE pg_catalog.pg_depend;
       vacuum ANALYZE pg_catalog.pg_shdepend;
      (3 rows)
      
       schemaname |   relname    | n_dead_tup
      ------------+--------------+------------
       pg_catalog | pg_attribute |      21218
       pg_catalog | pg_depend    |       9869
       pg_catalog | pg_shdepend  |       2730
      (3 rows)
      
      
      You are now connected to database "DC" as user "enterprisedb".
                      ?column?
      ----------------------------------------
       vacuum ANALYZE pg_catalog.pg_shdepend;
      (1 row)
      
       schemaname |   relname   | n_dead_tup
      ------------+-------------+------------
       pg_catalog | pg_shdepend |       2730
      (1 row)
      
      *********************COMPLETED 5******************************

Vacuum and analyze the catalog tables that have the most dead tuples. Here pg_attribute has the most dead tuples, so that is the one chosen for VACUUM ANALYZE.

      LC=# vacuum ANALYZE pg_catalog.pg_attribute;
      VACUUM
11. PostgreSQL Parameter Tuning:
• max_connections = <num>— This option sets the maximum number of database backends to have at any one time. Use it to ensure that you do not launch so many backends that you begin swapping to disk and kill the performance of all the children. Depending on your application it may be better to deny a connection entirely than to degrade the performance of all the other children.

max_connections = 1000   # 1000 connections can consume up to work_mem x 1000 = 16 GB in total (assuming work_mem = 16MB)

       

• shared_buffers = <num>— Editing this option is the simplest way to improve the performance of your database server. The default is pretty low for most modern hardware. General wisdom says that this should be set to roughly 25% of available RAM on the system. Like most of the options outlined here, you will simply need to try different levels (both up and down) and see how well they work on your particular system. Most people find that setting it larger than a third of RAM starts to degrade performance.

      shared_buffers = 25% of System Memory

       

• effective_cache_size = <num>— This value tells PostgreSQL's optimizer how much memory is available for caching data and helps in determining whether or not to use an index. The larger the value, the greater the likelihood of an index being used. This should be set to the amount of memory allocated to shared_buffers plus the amount of OS cache available. Often this is more than 50% of total system memory.

      effective_cache_size =50% of System Memory

       

• work_mem = <num>— This option controls the amount of memory used in sort operations and hash tables. While you may need to increase it if you do a ton of sorting in your application, care needs to be taken. This isn't a system-wide parameter but a per-operation one, so a complex query with several sort operations will use multiple work_mem units of memory, and multiple backends could be doing this at once. This can lead your database server to swap if the value is too large. This option was previously called sort_mem in older versions of PostgreSQL.

work_mem = 16MB (it depends on the backend operations; increase the size while executing a slow-running query or for high-cost queries, as in the following)

set work_mem='1MB';
select ...;   -- running time is ~2000 ms
set work_mem='96MB';
select ...;   -- running time is ~1500 ms

Note: this is session-level only, not server-level, so you can increase it just while running big sorts.

       

• wal_buffers = <num>— Should be about 3% of shared_buffers. If your Postgres server runs more and more INSERT, UPDATE, and DELETE statements, increase it to 8MB–16MB.

       

• max_fsm_pages = <num>— This option helps control the free space map. When something is deleted from a table it isn't removed from the disk immediately; it is simply marked as "free" in the free space map, and the space can then be reused by any new INSERTs on the table. If your setup has a high rate of DELETEs and INSERTs it may be necessary to increase this value to avoid table bloat. (This parameter exists only in PostgreSQL 8.3 and earlier; the free space map is managed automatically since 8.4.)

       

• fsync = <boolean>— This option determines whether all your WAL pages are fsync()'ed to disk before a transaction is committed. Having this on is safer, but can reduce write performance. If fsync is not enabled there is a chance of unrecoverable data corruption. Turn this off at your own risk.

       

• commit_delay = <num> and commit_siblings = <num>— Two parameters I don't recommend touching in most cases are commit_siblings and commit_delay, artifacts of an older attempt to improve performance on systems with slow commit times (which includes most systems that don't have a battery-backed write cache for accelerating that area). Nowadays turning off synchronous_commit is the simpler way to trade a small durability window for faster commits.

      default commit_delay is zero (no delay)

• random_page_cost = <num>— random_page_cost controls how PostgreSQL costs non-sequential disk reads. A higher value makes the planner more likely to choose a sequential scan over an index scan; lower the value if your server has very fast disks (such as SSDs) to make index scans more attractive.

If this is still confusing to you, Rayafeel does offer a PostgreSQL tuning service.

      Note that many of these options consume shared memory and it will probably be necessary to increase the amount of shared memory allowed on your system to get the most out of these options.
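
Pulling these together, here is a minimal postgresql.conf sketch for a hypothetical dedicated server with 16 GB of RAM; every value is a starting assumption to adjust for your workload:

# hypothetical starting values for a dedicated 16 GB server
max_connections = 200            # cap backends before work_mem x connections exhausts RAM
shared_buffers = 4GB             # ~25% of system memory
effective_cache_size = 8GB       # ~50% of system memory (shared_buffers + OS cache)
work_mem = 16MB                  # per sort/hash operation, per backend
wal_buffers = 16MB               # raise for write-heavy workloads
random_page_cost = 1.1           # a low value suits fast disks such as SSDs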

12. PostgreSQL Connection Pooling:

There are several ways to accomplish database connection pooling; here is a short list of common ones:

• Pgpool is a small server that you run on the same host as your clients and that pools database connections to a local or remote server. The application simply points at the pgpool instance instead of the normal postmaster; from the application's perspective nothing has changed, as the connection pooling is hidden from it (see the sketch after this list).
• In a mod_perl environment you can use Apache::DBI to handle database connection pooling inside Apache itself.
• SQLRelay is another database connection manager that is somewhat database-agnostic; it works with several databases other than PostgreSQL.
• You can always write a small bit of code to do this yourself, but I would highly recommend using an already developed solution to reduce the amount of debugging you have to do.
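
As a minimal sketch of that "nothing changes" property, assume Pgpool listening on its default port 9999 in front of a PostgreSQL on 5432 (host names are placeholders):

# before: the application connects straight to PostgreSQL
psql -h db.example.com -p 5432 -d edb -U enterprisedb

# after: the same application points at the Pgpool instance instead;
# Pgpool pools and forwards the connections transparently
psql -h localhost -p 9999 -d edb -U enterprisedb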

Tags: performance, postgresql
