Oracle Data Guard in 5 Minutes

ok, although I wrote that there are plenty of websites and blogs describing how to create a Data Guard standby, I’m doing one on my own… especially because a colleague asked me to write a manual, so I decided to do it here.

In my example I’m using two servers with Oracle 12.2 clusterware (Oracle Restart) including an ASM instance and a 12.2 database home. On the first server – let’s call it the primary node – I’ve also installed a non-CDB database with the Database Configuration Assistant (dbca). The name of the database is ORCL, the db_unique_name is ORCLRA (the db_unique_name is quite important for Data Guard to distinguish the different sites).

  • Prepare the source

there are some mandatory things that must be checked or adjusted before we can start to create the standby… the database must be in archivelog mode (Data Guard is a log shipping technology, so this is obvious) and we must enable force logging, so that even changes executed with “nologging” are transported:

SQL> shutdown immediate;
SQL> startup mount;
SQL> alter database archivelog;
SQL> alter database force logging;
SQL> alter database open;
  • add standby redo logs

for those who are asking what the hell standby redo logs are – here is a short explanation: the primary database transfers its redo information to the standby over the network… this means there must be an Oracle instance receiving this information and storing it somewhere. And because Oracle wants to keep standby and redo information separate, there is a dedicated log type called standby log. The easiest way is to create them on the primary site – when the primary becomes a standby, the standby logs are already there… and if the logs are present before duplicating the database, they will be created automatically during the copy 😉 We create one more standby log group than we have redo log groups – this is because there is one extra source that can ship data from the primary: there can be a stream from each redo log group plus one directly from the log buffer. For further information you may want to consult the Data Guard documentation…

SQL> select count(*) from v$log;

SQL> select distinct bytes from v$log;

SQL> alter database add standby logfile group 11 size 209715200;
SQL> alter database add standby logfile group 12 size 209715200;
SQL> alter database add standby logfile group 13 size 209715200;
SQL> alter database add standby logfile group 14 size 209715200;
  • enable standby_file_management

often forgotten, but in my opinion an important parameter. The default is “MANUAL”, which means that when you add a datafile on the primary site, you have to do the same on the standby site. If you forget to do so, the replication will stop… and we are humans, and humans forget things sometimes. So just set the parameter to “AUTO” and Oracle will take care of it.
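a minimal sketch – the parameter can be changed online:

```sql
-- on the primary (and later on the standby as well)
alter system set standby_file_management='AUTO' scope=both;
```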

  • modify listener.ora

OK, now we need to modify the listener.ora file – Oracle Data Guard needs network access to the database under any circumstances, even when the database is down. To solve this problem, we have to add a static entry to the listener.ora…

vi $GRID_HOME/network/admin/listener.ora

LISTENER =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = primary)(PORT = 1521))
  )

ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=ON              # line added by Agent

# add static listener entry starts here:
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = ORCLRA_DGMGRL)
      (SID_NAME = ORCL)
      (ORACLE_HOME = /u00/app/oracle/product/12.2.0/db_1)
    )
  )

the same must be done on the standby site, too – only change the “GLOBAL_DBNAME” to a proper value – the “_DGMGRL” suffix is mandatory for Oracle Data Guard… regarding the db_unique_name: I usually use the db_name with the suffix RA (for “recovery site A”), RB, RC, etc… and finally restart or reload the listener to enable the new configuration.
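on the standby site the static entry could look like this (a sketch following the naming convention above):

```
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = ORCLRB_DGMGRL)
      (SID_NAME = ORCL)
      (ORACLE_HOME = /u00/app/oracle/product/12.2.0/db_1)
    )
  )
```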

  • init and password file

now we are doing the last few steps of preparation… to start the standby database, we need a pfile… so we just create one from the primary:

SQL> create pfile='/tmp/initORCL.ora' from spfile;

to be able to log on as sys over the network, we need the password file on the standby site… so we just copy it together with the pfile to the other server:

BASH> scp /tmp/init${ORACLE_SID}.ora standby:${ORACLE_HOME}/dbs/
BASH> scp ${ORACLE_HOME}/dbs/orapw${ORACLE_SID} standby:${ORACLE_HOME}/dbs/

And now just edit the pfile and replace all values containing the old db_unique_name with the new one.

# do on standby site!
BASH> sed -i 's/ORCLRA/ORCLRB/g' ${ORACLE_HOME}/dbs/init${ORACLE_SID}.ora

to start the instance, you have to create the audit file destination manually:

BASH> grep audit_file_dest ${ORACLE_HOME}/dbs/init${ORACLE_SID}.ora
BASH> mkdir -p /u00/app/oracle/admin/ORCLRB/adump
  • create spfile

Oracle Data Guard requires an spfile, so we have to create one…

BASH> echo "ORCL:/u00/app/oracle/product/12.2.0/db_1:N" >> /etc/oratab
BASH> . oraenv <<< ORCL
SQL> startup nomount;
SQL> create spfile='+DATA/ORCLRB/spfileORCL.ora' from pfile;
SQL> shutdown;
BASH> mv ${ORACLE_HOME}/dbs/init${ORACLE_SID}.ora ${ORACLE_HOME}/dbs/init${ORACLE_SID}.ora.bak
BASH> echo "spfile='+DATA/ORCLRB/spfileORCL.ora'" > ${ORACLE_HOME}/dbs/init${ORACLE_SID}.ora
SQL> startup nomount;
SQL> show parameter spfile
NAME                                 TYPE        VALUE
------------------------------------ ----------- -----------------------------------------------
spfile                               string      +DATA/ORCLRB/spfileorcl.ora
  • final checks before copy

ok, now we have a primary database in archivelog mode with force logging enabled, and a standby instance in nomount state with the same password file, using an spfile. Furthermore, both instances are accessible over the network… so let’s check if it’s working!
the easiest way is just to log on to the database via sqlplus:

sqlplus sys/oracle@primary:1521/orclra_dgmgrl as sysdba
sqlplus sys/oracle@standby:1521/orclrb_dgmgrl as sysdba

if both are working, we can continue; otherwise check the password file and listener entries again. Checking the sqlnet.ora to see whether ezconnect is allowed is also a good idea.
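for reference, the relevant sqlnet.ora line (assuming the default TNSNAMES + EZCONNECT name resolution) looks like this:

```
NAMES.DIRECTORY_PATH = (TNSNAMES, EZCONNECT)
```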

  • duplicate

the next step is the easiest one, although it might take a while depending on the size of your database… we do an RMAN duplicate:

rman target sys/oracle@primary:1521/orclra_dgmgrl auxiliary sys/oracle@standby:1521/orclrb_dgmgrl
RMAN> duplicate target database for standby from active database;

when done, you should also add the database to the clusterware, so it will be started automatically after a reboot etc.

BASH> srvctl add database -db ORCLRB -oraclehome /u00/app/oracle/product/12.2.0/db_1 -role physical_standby -instance ORCL -startoption MOUNT
  • enable log shipping

ok, and now the last step: configure the Data Guard broker. First we have to enable the broker on both sites…

SQL> alter system set dg_broker_start=true;

and now (maybe wait a few seconds until the broker process is really up) we have to configure it… once you understand the syntax, it’s really a no-brainer.

BASH> dgmgrl /
DGMGRL> create configuration 'DG' as primary database is 'ORCLRA' connect identifier is '';
DGMGRL> add database 'ORCLRB' as connect identifier is '';
DGMGRL> enable configuration;

that’s it… you can check your success via the broker (check the lines “Transport Lag” and “Apply Lag”) or by checking the alert.log of the standby database (just check whether the media recovery process is waiting for an archivelog in transit)… suspicious people may do both 😉

DGMGRL> show database 'ORCLRB'
Database - ORCLRB

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          0 seconds (computed 0 seconds ago)
  Average Apply Rate: 4.00 KByte/s
  Real Time Query:    OFF

BASH> tail /u00/app/oracle/diag/rdbms/orclrb/ORCL/trace/alert_ORCL.log
Media Recovery Waiting for thread 1 sequence 15 (in transit)
Recovery of Online Redo Log: Thread 1 Group 12 Seq 15 Reading mem 0
  Mem# 0: +DATA/ORCLRB/ONLINELOG/group_12.274.986057281
  Mem# 1: +DATA/ORCLRB/ONLINELOG/group_12.275.986057283

just tell me if I have missed something…

so far


fast provisioning with snapshot clones

Every DBA knows the situation where your boss / customer asks you to copy fresh data into the test database… since this blog is about Oracle databases, we usually solve this with either “datapump” or “rman duplicate” – and there is nothing wrong with that. But now imagine we are talking about more than 400 test environments. To satisfy all testers, the production is exported every day (although not all test databases are refreshed daily), which takes several hours each day. And then there are the costs for the expensive SAN storage… So I was asked if there is a better, faster and cheaper solution. As you may suppose, after some research I found the 12c new feature of snapshot clones… and it’s really nice, so I want to show you how it works.


the advantages:

  • really fast provisioning (a few seconds)
  • very efficient storage usage


the drawbacks:

  • it’s based on pluggable databases, so you need the multitenant option license
  • only 3 storage types/vendors are certified so far: ZFS, ACFS and NetApp

since I do not have a NetApp or ZFS appliance, I’m using ACFS for my test, but there are also some things to consider:

  • it’s slower than plain ASM
  • you have to install a RAC – Oracle Restart is not sufficient… nevertheless it’s OK to run a one-node “cluster” if you don’t want to buy the RAC option.

OK, let’s start… first – as mentioned above – we need a RAC. If you don’t know how to install one, I recommend visiting this site: RAC Attack.
Next we are going to create an ACFS volume… this can be done easily with asmcmd:

bash> asmcmd volcreate -G <diskgroup> -s 10G <volume_name>

of course you have to replace the variables in the brackets… in my test environment I created a dedicated diskgroup called ACFS and a volume named oradata.
then we have to check which device name was reserved for this new volume:

bash> asmcmd volinfo -G ACFS oradata
Diskgroup Name: ACFS

         Volume Name: ORADATA
         Volume Device: /dev/asm/oradata-30
         State: ENABLED

Great! Now we have to create a filesystem on that virtual block device:

mkfs.acfs /dev/asm/oradata-30

and last step: mount the volume

sudo /sbin/acfsutil registry -a /dev/asm/oradata-30 /u00/app/oracle/oradata

that’s it – now the clusterware should detect the new entry in the OCR (the check runs every 30 seconds) and automount the volume… if not, you can do it manually:

sudo mkdir /u00/app/oracle/oradata
sudo /bin/mount -t acfs /dev/asm/oradata-30 /u00/app/oracle/oradata

next chapter: create the databases. Since I want to simulate the complete workflow, I created two databases: one is the source (=production) called ORCL, the other the target (=test) database named CDB. Both databases are located on ASM diskgroups (not the ACFS volume!). The source is installed the classical way (no multi- or singletenant) while the target is installed as a CDB, but without any PDB.
To use the PDB-clone-feature, we have to create a database link from the target to the source first:

SYS@CDB1>CREATE DATABASE LINK orcl CONNECT TO system IDENTIFIED BY <password> USING 'rac01-scan:1521/orcl';

I just used the user “system”, but any other user with the “create session” and “create pluggable database” privilege is sufficient.
next we have to prepare the directory and create the master pdb

bash> mkdir -p /u00/app/oracle/oradata/CDB/prod_seed
sql>  CREATE PLUGGABLE DATABASE prod_seed FROM NON$CDB@orcl CREATE_FILE_DEST='/u00/app/oracle/oradata/CDB/prod_seed';

based on the size of the database and the speed of your storage, this will take some minutes…
when finished, you have a seed PDB with the internal structure of a non-CDB… so you have to convert it first.

SQL> @$ORACLE_HOME/rdbms/admin/noncdb_to_pdb.sql

please note that you have to open the database first so some kind of instance recovery can take place; afterwards you can bring the PDB into the correct state to be a seed database.
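one possible sequence as a sketch (I’m assuming the seed is left open read only afterwards, so it can serve as a stable clone source):

```sql
alter pluggable database prod_seed open;
alter session set container = prod_seed;
@$ORACLE_HOME/rdbms/admin/noncdb_to_pdb.sql
alter pluggable database prod_seed close immediate;
alter pluggable database prod_seed open read only;   -- assumption: keep the seed read only
```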
And now: create the clone

bash> mkdir -p /u00/app/oracle/oradata/CDB/clone1
SQL> CREATE PLUGGABLE DATABASE clone1 FROM prod_seed CREATE_FILE_DEST='/u00/app/oracle/oradata/CDB/clone1' SNAPSHOT COPY;

P.S. in some guidelines you will be advised to set the parameter clonedb to true – don’t do it, at least not if you are using ACFS! It’s only necessary if your storage can’t do snapshots, in which case Oracle tries to emulate them itself. For further details, just read the manual (search for “SNAPSHOT COPY”).
OK, and now? Finished! … and where is the magic? The last ‘create pluggable database…’ took just a few seconds, regardless of the size of the seed database, and it consumes no space at all (at least for the moment)…

bash> du -sh *
228K    clone1
3.6G    prod_seed

and what’s inside?

ls -lhr clone1/CDB/72C5EDFE41727F36E0531F01100AB18B/datafile/
total 20K
lrwxrwxrwx. 1 oracle dba 154 Aug  6 16:17 o1_mf_users_fpjp2wdt_.dbf -> /u00/app/oracle/oradata/.ACFS/snaps/72C5EDFE41727F36E0531F01100AB18B/CDB/prod_seed/CDB/72C340E0CEE355E1E0531F01100AFEB9/datafile/o1_mf_users_fpjbvo7t_.dbf
lrwxrwxrwx. 1 oracle dba 157 Aug  6 16:17 o1_mf_undotbs2_fpjp2wdg_.dbf -> /u00/app/oracle/oradata/.ACFS/snaps/72C5EDFE41727F36E0531F01100AB18B/CDB/prod_seed/CDB/72C340E0CEE355E1E0531F01100AFEB9/datafile/o1_mf_undotbs2_fpjbvo7g_.dbf
lrwxrwxrwx. 1 oracle dba 157 Aug  6 16:17 o1_mf_undotbs1_fpjp2wco_.dbf -> /u00/app/oracle/oradata/.ACFS/snaps/72C5EDFE41727F36E0531F01100AB18B/CDB/prod_seed/CDB/72C340E0CEE355E1E0531F01100AFEB9/datafile/o1_mf_undotbs1_fpjbvo4c_.dbf
lrwxrwxrwx. 1 oracle dba 155 Aug  6 16:17 o1_mf_system_fpjp2w7z_.dbf -> /u00/app/oracle/oradata/.ACFS/snaps/72C5EDFE41727F36E0531F01100AB18B/CDB/prod_seed/CDB/72C340E0CEE355E1E0531F01100AFEB9/datafile/o1_mf_system_fpjbvnyh_.dbf
lrwxrwxrwx. 1 oracle dba 155 Aug  6 16:17 o1_mf_sysaux_fpjp2wc4_.dbf -> /u00/app/oracle/oradata/.ACFS/snaps/72C5EDFE41727F36E0531F01100AB18B/CDB/prod_seed/CDB/72C340E0CEE355E1E0531F01100AFEB9/datafile/o1_mf_sysaux_fpjbvo3z_.dbf

so Oracle “tells” the storage (in this case the ACFS driver) to take a snapshot of these data files (and only these files – if there are other databases on this ACFS volume, they won’t be inside the snapshot); afterwards these snapshots are linked to the location where the data files of the new clone database should be – that’s it… very easy, and the DBA does not have to worry about storage snapshots, it’s all done automatically. If you drop the clone, the snapshot is also deleted automatically.
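dropping a clone could be sketched like this – the underlying ACFS snapshot is cleaned up behind the scenes:

```sql
alter pluggable database clone1 close immediate;
drop pluggable database clone1 including datafiles;
```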

just tell me if you have any questions…



Performance drilldown

well, every DBA might know these kinds of phone calls, where a user is complaining about poor performance of the database… but if you ask questions like when, or what part of the application is slow, you don’t get useful answers.
so you have to play “detective” and find the guy who murdered the performance 😉
But where to start? You might generate some AWR reports and try to find something nasty – but in my opinion this is like looking for a needle in a haystack; you need something covering a metric over time… my advice: use DBA_HIST_SYSMETRIC_SUMMARY (as long as you have the proper license!)

if you are in a hurry and need something handy now, you may use this:

set linesize 200
col "CPU USAGE %" for a12 heading "CPU|USAGE %"
col "LOGICAL READ IO/s" for a12 heading "LOGICAL|READ IO/s"
col "PHYSICAL READ IO/s" for a12 heading "PHYSICAL|READ IO/s"
col "PHYSICAL READ MB/s" for a12 heading "PHYSICAL|READ MB/s"
col "PHYSICAL WRITE IO/s" for a12 heading "PHYSICAL|WRITE IO/s"
col "PHYSICAL WRITE MB/s" for a12 heading "PHYSICAL|WRITE MB/s"
col "REDO MB/s" for a12 heading "REDO|MB/s"
col "BUFFER HIT RATIO" for a12 heading "BUFFER|HIT RATIO"
select snap_id
   ,to_char(begin_interval_time, 'YYYY-MM-DD HH24:MI') "DATE"
   ,round(cpu_value,2) "CPU USAGE %"
   ,round(logical_read_value) "LOGICAL READ IO/s"
   ,round(physical_read_value) "PHYSICAL READ IO/s"
   ,round(physical_read_byte_value/1024/1024) "PHYSICAL READ MB/s"
   ,round(physical_write_value) "PHYSICAL WRITE IO/s"
   ,round(physical_write_byte_value/1024/1024) "PHYSICAL WRITE MB/s"
   ,round(redo_byte_value/1024/1024,2) "REDO MB/s"
   ,round(100-(physical_read_value/logical_read_value*100),2) "BUFFER HIT RATIO"
from (
   select b.snap_id, b.begin_interval_time, a.metric_name, a.average
   from dba_hist_sysmetric_summary a
   inner join dba_hist_snapshot b
      on a.snap_id = b.snap_id and a.dbid = b.dbid and a.instance_number = b.instance_number
)
pivot (sum(average) as value for (metric_name) in (
          'Logical Reads Per Sec' as logical_read
         ,'Physical Reads Per Sec' as physical_read
         ,'Physical Writes Per Sec' as physical_write
         ,'Physical Read Total Bytes Per Sec' as physical_read_byte
         ,'Physical Write Total Bytes Per Sec' as physical_write_byte
         ,'Redo Generated Per Sec' as redo_byte
         ,'Host CPU Utilization (%)' as cpu))
order by snap_id;

Keep in mind: this SQL gives you the average values between two snapshots; dba_hist_sysmetric_summary can also provide the peak values…
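for example, the MAXVAL column holds the peak value within each snapshot interval – a quick sketch for the CPU metric:

```sql
select snap_id, round(maxval,2) "PEAK CPU %"
from dba_hist_sysmetric_summary
where metric_name = 'Host CPU Utilization (%)'
order by snap_id;
```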

so far


recover standby database

well, Oracle Data Guard in general is rock solid (if configured well – and you will find a lot of guides on the internet showing how to do so)… but as always, sometimes something does not work as it should and the standby is out of sync afterwards (e.g. a network outage between the servers, and to avoid an archiver stuck on the primary site you forced a deletion of the archive logs).

OK, the network is working again, so how do we close the gap? If only a few archivelogs are missing, my first choice would be a restore of the archivelogs from backup. But what if thousands of archivelogs are missing? Restore them all from tape? Based on my experience, this can be a very time-consuming procedure; in the worst case the tape drive is not fast enough to catch up with the primary. In this case you may consider rebuilding the standby (rman> duplicate database for standby from active database;) … but have you ever tried to do this with a 50 TB database? The solution might be an incremental recovery! If you consider some peculiarities, it’s not so complicated…

  1. stop redo apply (it’s not working anyway)

you cannot apply an incremental backup while the apply is running, so let’s stop it:

dgmgrl /
edit database 'ORCLRB' set state='APPLY-OFF';
  2. evaluate what’s missing

Then we have to figure out where our incremental backup should start… the easy way is to query the v$database view to get the current SCN – but this only returns the information stored in the controlfile; sometimes the headers of the datafiles have a lower SCN and rman refuses to start the recovery. So the SQL should look like this:

# execute on standby!
sqlplus / as sysdba
set linesize 200
   select current_scn SCN from v$database
   union all
   select min(checkpoint_change#) SCN from v$datafile_header
   where file# not in (select file# from v$datafile where enabled = 'READ ONLY');
  3. start backup

well, take the lowest SCN returned by the query above and start an incremental backup from that SCN on the primary database… and when it is finished, you also need a fresh standby controlfile:

# execute on primary!
sqlplus / as sysdba
alter database create standby controlfile as '/tmp/stby.ctl';
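the incremental backup itself could be sketched like this (`<SCN>` is a placeholder for the lowest value returned by the query above; the destination matches the path cataloged in the next step):

```
# execute on primary!
rman target /
backup incremental from scn <SCN> database format '/tmp/recover_standby/%U';
```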
  4. start recover

First you have to make the backup available on the standby site… either mount it as an NFS device or copy the backup files via scp to the standby site. Then catalog the files and start the recovery…

# execute on standby!
rman target /
catalog start with '/tmp/recover_standby';
recover database noredo;
  5. restore controlfile

now we have to “update” the controlfile… since this is not possible, we do a full restore.

# execute on standby!
rman target /
shutdown immediate;
startup nomount;
restore controlfile from '/tmp/stby.ctl';
alter database mount;

if the structure of the database is not exactly the same as on the primary site, you have to catalog the new file and folder names and switch to them:

rman target /
catalog start with '+U01/ORCLRB/DATAFILE';
switch database to copy;
  6. enable redo apply

almost finished – now we only have to enable the Data Guard apply process again…

dgmgrl /
edit database 'ORCLRB' set state='APPLY-ON';

done… just check the alert.log to see whether the standby is receiving and applying the last few archivelogs.


auditing – what the hell is going on in my database

When I came into the office today, a colleague of mine was already waiting for me – one of our daily datapump exports has failed again. When looking into the log, I found this error message:

ORA-01466: unable to read data - table definition has changed

ok, this shouldn’t happen since the application is not doing DDL (hopefully), so who else could it be? From ancient times I remembered that Oracle can audit several activities… but when I asked Google for the name of the audit table, I was confused: it seems that in Oracle 12c there are two different types of auditing.

  1. old style
     this is maybe the method you already know… for auditing all ddl statements on all tables, you just have to enter

    audit table

     … very simple – but boring 😉

  2. new style
     the new style is a bit more complex – you have to define a policy first and then activate it. I’m no audit expert, but as far as I could gather, the new style is very powerful and can be configured more precisely, although some features of the old method are missing.

Ok, but back to my problem… to make a long story short: by default, DDL statements are not captured, but capturing can be enabled with both methods. So I decided to use the new way…

so I first checked what is already in my audit table…

SELECT * FROM unified_audit_trail WHERE event_timestamp > sysdate-1 ;

… and waited. I found several posts complaining about slow performance of the audit table when there are too many entries. So the first step is a cleanup of the current table. You can do it manually, but I decided to create a scheduler job doing it regularly… and while I’m at it, the job also cleans up the old style audit logs… so after some trial and error my job looks like this:

BEGIN
  dbms_scheduler.create_job(
       job_name        => 'PURGE_ALL_AUDIT_LOGS'
      ,start_date      => sysdate
      ,repeat_interval => 'freq=daily; byhour=1; byminute=0; bysecond=0'
      ,job_type        => 'PLSQL_BLOCK'
      ,job_action      => 'BEGIN ... END;'  -- the purge PL/SQL block
      ,enabled         => TRUE);
END;
/
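the PL/SQL inside job_action could be sketched like this (the 30-day retention is an assumption, and the one-time INIT_CLEANUP call belongs outside the job, executed once before the first run):

```sql
-- one-time initialization (mainly needed for the old style audit trails):
BEGIN
  DBMS_AUDIT_MGMT.INIT_CLEANUP(
      audit_trail_type         => DBMS_AUDIT_MGMT.AUDIT_TRAIL_ALL,
      default_cleanup_interval => 24);
END;
/
-- what the job does: purge everything older than 30 days (assumption)
BEGIN
  DBMS_AUDIT_MGMT.SET_LAST_ARCHIVE_TIMESTAMP(
      audit_trail_type  => DBMS_AUDIT_MGMT.AUDIT_TRAIL_UNIFIED,
      last_archive_time => systimestamp - interval '30' day);
  DBMS_AUDIT_MGMT.CLEAN_AUDIT_TRAIL(
      audit_trail_type        => DBMS_AUDIT_MGMT.AUDIT_TRAIL_UNIFIED,
      use_last_arch_timestamp => TRUE);
END;
/
```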

first we have to initialize the cleanup… this is not necessary for the new unified audit logs, but it is for the old style. Then we have to define the retention time for each audit type separately, and last but not least do the purge itself… easy, isn’t it? After running the job once I was able to query the audit table… ok, next step: implement DDL auditing.
First of all you should know that the audit rules are read during login… so if you are wondering why the audit is not working: try logging out and in again 😉 I have no idea whether shared server processes behave differently…
So, back to the audit policy… in general there are three different ways to define when a command should be logged.

  1. defining the used privileges
     when someone does not own a table but is able to read or modify it because of special system privileges (like ‘select any table’)

  2. defining actions
     you can define the action that should be logged. An action could be ‘create table’ or ‘update table’

  3. defining the used role
     when the right to do a special operation is granted via a role, you can just specify the role.

Since I want to audit all DDL regardless of why the user is allowed to do it, my policy looks like this:
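a policy along these lines should do the trick (a sketch – the action list is an assumption, extend it to whatever DDL you care about):

```sql
CREATE AUDIT POLICY ddl ACTIONS CREATE TABLE, ALTER TABLE, DROP TABLE, TRUNCATE TABLE;
```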


if you want to narrow it down a bit, you can specify the object that should be monitored…

CREATE audit policy test ACTIONS delete on hr.employees;

And last but not least it’s possible to add custom filter criteria…

CREATE audit policy test ACTIONS delete on hr.employees WHEN 'SYS_CONTEXT(''USERENV'', ''USERNAME'') != ''HR''';

Finally, you have to enable the policy:

audit policy ddl;

that’s it… I hope I could give you a simple and easy to understand introduction to the auditing feature of Oracle.



how to use SQL tuning advisor with SQL*Plus

because I’ve worked in a lot of environments where I only have ssh access, I cannot use graphical tools like Toad or SQL Developer… so I had to find a suitable way to tune SQL statements with SQL*Plus… I’m only describing how to use the SQL tuning advisor here; how to find the correct sql_id will be another article.

TAKE CARE: you need the Oracle Diagnostics and Tuning Pack license!

ok, now the technical part… there are three steps: create a tuning task (where you define what Oracle should do), execute the tuning task, and show the result. Let’s start with #1:

define sql_id=abcdefghijkl
define begin_snap=1234
define end_snap=1235

DECLARE
  l_sql_tune_task_id  VARCHAR2(100);
BEGIN
  l_sql_tune_task_id := DBMS_SQLTUNE.create_tuning_task (
                          begin_snap  => &begin_snap,
                          end_snap    => &end_snap,
                          sql_id      => '&sql_id',
                          scope       => DBMS_SQLTUNE.scope_comprehensive,
                          time_limit  => 7200, -- max runtime in seconds... so this means 2 hours
                          task_name   => '&sql_id._AWR_tuning_task');
  DBMS_OUTPUT.put_line('l_sql_tune_task_id: ' || l_sql_tune_task_id);
END;
/

ok, now you have to execute the tuning task

EXEC DBMS_SQLTUNE.execute_tuning_task(task_name => '&sql_id._AWR_tuning_task');

and when the task has finished, you can view the report with this sql statement:

SET LONG 10000;
column RECOMMENDATIONS format a200
-- optionally you can spool the output
-- spool sql_advisor_&sql_id._AWR_tuning_task.txt
SELECT DBMS_SQLTUNE.report_tuning_task('&sql_id._AWR_tuning_task') AS recommendations FROM dual;
-- spool off

if you want to accept a SQL profile, the syntax to do so is within the report, but it is always the same…

exec dbms_sqltune.accept_sql_profile(task_name => '&sql_id._AWR_tuning_task', replace => TRUE);

if you want to undo (=remove) the sql profile, you just have to execute this:

exec DBMS_SQLTUNE.drop_sql_profile (name => '&sql_id._AWR_tuning_task');

you can drop the tuning task including the report (but without the profile) with this:

exec dbms_sqltune.drop_tuning_task(task_name => '&sql_id._AWR_tuning_task');

if you want to see which SQL tuning tasks exist, you can query the related DBA view…

select task_name from DBA_ADVISOR_TASKS where advisor_name='SQL Tuning Advisor';

small hint: the automatic SQL tuning advisor runs every night (if you haven’t disabled it). Internally it’s just a regular SQL tuning task called ‘SYS_AUTO_SQL_TUNING_TASK’… you can get the report as described above (you just have to replace the task name) and see the recommendations for all automatically analysed SQL statements within your AWR retention time… pretty cool to get an overview of slow SQL.
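pulling that report is the same call as above, just with the fixed task name:

```sql
SET LONG 1000000
SELECT DBMS_SQLTUNE.report_tuning_task('SYS_AUTO_SQL_TUNING_TASK') AS recommendations FROM dual;
```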


creating a blog the nerd way – Part 2

part 1 is here

ok, now we have a clustered mysql (or better: MariaDB) database running; the next step is to set up the blog itself…

  • Web Server

to serve the blog, we have to install the webserver first – I decided to use the de-facto standard Apache2. On Ubuntu it’s very easy to install:

sudo apt-get install apache2 php php-mysql

done… of course you can now modify the configuration etc – but for my purpose the default is sufficient.

  • Storage for the blog

since I want to distribute all uploaded files and the blog binaries to both servers, I need either a shared volume (e.g. NFS from my NAS) or a cluster file system… since I’m a nerd and NFS is too easy, I was looking for a cluster solution – and found Gluster.

the installation is again easier than I thought it might be…

sudo apt-get install software-properties-common
sudo add-apt-repository ppa:gluster/glusterfs-3.8
sudo apt-get update
sudo apt-get install -y glusterfs-server glusterfs-client
sudo service glusterfs-server start

of course you have to do the installation on both servers… afterwards, you only have to create the volumes to be clustered. For this I created the mountpoint /data on dedicated disks (one per server) – since there are thousands of tutorials describing how to add and partition a new disk, I won’t repeat it here.

# on both servers
sudo mkdir -p /data/gluster/gvol0

# on one node only
sudo gluster peer probe <SERVER2>
sudo gluster pool list
sudo gluster volume create gvol0 replica 2 <SERVER1>:/data/gluster/gvol0 <SERVER2>:/data/gluster/gvol0
sudo gluster volume start gvol0
sudo gluster volume info gvol0

the last command should tell you that everything is fine. Now we just have to mount the gluster volume:

# on both nodes
sudo mkdir -p /var/www/blog
sudo mount -t glusterfs localhost:/gvol0 /var/www/blog

as you can see, we are mounting a device of type glusterfs – this means the gluster client connects to the gluster server instead of directly accessing the local disk. This guarantees that we get the current data, even if it hasn’t been distributed to all servers yet.
To make it permanent you can add the mount command to the /etc/fstab file, but be aware: the mountpoint /data must already be mounted and the glusterfs daemon must be running… so I recommend configuring it with systemd units, where you can define dependencies, or via mount commands in /etc/rc.local.
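a minimal /etc/fstab line could look like this – the _netdev option delays the mount until networking is up (whether that is enough for the gluster daemon on your setup is an assumption to verify):

```
localhost:/gvol0  /var/www/blog  glusterfs  defaults,_netdev  0  0
```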

  • install wordpress

first we have to make the directory known to apache:

sudo chown www-data:www-data /var/www/blog
echo "<VirtualHost *:80>" > /etc/apache2/sites-available/blog.conf
echo "DocumentRoot /var/www/blog" >> /etc/apache2/sites-available/blog.conf
echo "ServerName <YOUR PUBLIC FQDN>" >> /etc/apache2/sites-available/blog.conf
echo "</VirtualHost>" >> /etc/apache2/sites-available/blog.conf
a2ensite blog
service apache2 reload
wget https://wordpress.org/latest.zip -P /tmp
unzip /tmp/latest.zip -d /var/www/blog/
mv /var/www/blog/wordpress/* /var/www/blog/
rm -rf /var/www/blog/wordpress/

now you just have to open a browser on your client and navigate to the FQDN you just entered in your apache config… I assume you have already configured your firewall. If you don’t want to open the firewall yet, it’s also possible to modify your /etc/hosts (*nix) or %windir%\system32\drivers\etc\hosts (win) file and enter the FQDN and private IP address of one of your servers. The rest of the setup process is easy and not worth describing here.

  • automatic failover

nearly finished – the blog itself is running and the content is distributed over two independent servers… now we only have to establish some kind of failover or load balancing mechanism to make sure it’s always on. Since the goal is always-on and I don’t expect thousands of users a day, I just searched for an IP-based HA solution – and found UCARP.
just install the packages and add a few lines to the network configuration – done

sudo apt-get install ucarp

and now just add some ucarp specific settings to the network configuration file /etc/network/interfaces

# ucarp configuration
ucarp-vid 1 # vid : The ID of the virtual server [1-255]
ucarp-vip # vip : The virtual address
ucarp-password secret # password : A password used to encrypt Carp communications
ucarp-advskew 1 # advskew : Advertisement skew [1-255]
ucarp-advbase 1 # advbase : Interval in seconds that advertisements will occur
ucarp-master yes # master : determine if this server is the master. Must be "no" on 2nd server!

iface ens32:ucarp inet static

that’s it – now just restart the network and configure your firewall to forward to the virtual IP…

sudo service networking restart

done… ok, there are some other tasks you might do, e.g. secure the blog with an SSL certificate. Maybe I’ll write another blog post describing how to do that 😉

If you have any questions or suggestions, just leave a note…


creating a blog the nerd way – Part 1

as you may have already noticed, I’ve got a blog (hey, you are here ^^)… creating a blog sounds very easy today – there are lots of websites offering free blogs, you just have to choose. But if you are a nerd like me, you may have a bit more to do…

  • Design
  • In my daily work, high availability is one of the main topics… and although I wouldn’t call this blog important enough to be always on… well I’m a nerd, so I’m going to cluster everything 😉

  • Hardware
  • I already run two small Servers with VMware, so I just created two new virtual Machines.

  • OS
  • I have chosen the latest Ubuntu LTS… the setup is well documented, so I’m not going to repeat it here.

  • Database
  • now the funny part begins 😉 If I had the choice I would have installed an Oracle DB – but unfortunately it’s not for free. So I’m going for the 2nd best solution – mysql. First I had to decide which cluster technology to use – either “master slave replication” or “galera cluster”… translated to Oracle terms that means Data Guard or RAC. OK, Galera is not exactly like RAC because it’s a shared nothing architecture! Sounds pretty cool, so Galera, prepare yourself to be used by me 😛
    Since Galera works best with MariaDB, I made a small adjustment regarding the used product, but MariaDB is a fork of mysql and almost all products able to run on mysql should work on MariaDB (as you can see, wordpress works fine ^^) And just to confirm my decision, I found the announcement that MariaDB 10.1 and higher already include the galera binaries… perfect. So you only have to add the MariaDB repository to the package management and run the installer. And to make it easy, the project website has very good documentation on which commands to execute depending on your OS and designated db version…
    so i just entered

    apt install mariadb-server

    on both servers and had two MariaDB databases running… but wait – I don’t want two databases but just one cluster! But everything is working as it should – unlike Oracle, you link the databases afterwards… sounds weird, but it works.
    so we just have to create a galera configuration

    vi /etc/mysql/conf.d/galera.cnf

    [mysqld]
    # Galera Provider Configuration
    wsrep_on=ON
    wsrep_provider=/usr/lib/galera/libgalera_smm.so
    # Galera Cluster Configuration
    wsrep_cluster_name="<CLUSTER NAME>"
    wsrep_cluster_address="gcomm://<SERVER 1>,<SERVER 2>"
    # Galera Synchronization Configuration
    wsrep_sst_method=rsync
    # Galera Node Configuration
    wsrep_node_address="<LOCAL IP>"
    wsrep_node_name="<LOCAL HOSTNAME>"

    of course you have to replace the values in the brackets with something suitable for your environment. If you are using names for “wsrep_cluster_address”, make sure you can resolve the names – otherwise use IPs.
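    a quick way to test the resolution before restarting anything – “localhost” here is just a stand-in for your real <SERVER 1>/<SERVER 2> names:

```shell
# check that a node name resolves before putting it into wsrep_cluster_address;
# "localhost" is a stand-in for your real server names
if getent hosts localhost > /dev/null; then
  echo "name resolves - ok for gcomm://"
else
  echo "name does not resolve - use the IP instead"
fi
```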
    now we have to stop mysql and create a new cluster…

    systemctl stop mysql  # on all Nodes!
    galera_new_cluster    # on first node only!
    systemctl start mysql # on all other nodes

    that’s it!
    to check if all is working you can execute this SQL:

    mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size'"

    the output should be the number of cluster members – and should of course work on all nodes.
    if you want some more details – i found this site with a step by step guide and a lot of explanations.

    … will be continued


    alter table … shrink space – and its progress

    well, there are a lot of good tutorials regarding when and how to use the “alter table … shrink space” command – and don’t panic, I won’t do another one 😉

    just a few words about the mechanism behind it: the shrink is done in 2 steps

    1. the rows from the “end” of the table are moved to the “beginning”. Internally this is done by “select – insert – delete” statements, row by row. So there are only locks on row level and a lot of small transactions. Therefore, this part is completely online and has just a small performance impact on running transactions (I’ve seen somewhere between 2% and 10%, depending on your application)
    2. the “high water mark” is reset to the last used block of the table. Oracle has to lock the table exclusively for this – so it’s not really online.

    and a short summary about the different “shrink space” commands (because I always forget them myself)

    • alter table … shrink space:
      • does both steps in succession
    • alter table … shrink space cascade:
      • does both steps in succession for the table and all related objects like indexes and lobs
      • if you want to see all related objects, just execute this:
        select * from (table(DBMS_SPACE.OBJECT_DEPENDENT_SEGMENTS('<OWNER>','<TABLE>',null, 1)));
    • alter table … shrink space compact:
      • does only step 1
    • alter table … modify lob (…) (shrink space):
      • does steps 1 & 2 for the lob column
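    one thing the commands above silently assume: before any shrink, row movement must be enabled on the table, otherwise Oracle raises ORA-10636. A minimal sequence as a sketch (SCOTT.EMP is just a placeholder for your own table):

```sql
-- row movement has to be enabled once per table before shrinking
alter table scott.emp enable row movement;

-- step 1 only (online), e.g. during business hours
alter table scott.emp shrink space compact;

-- steps 1 & 2 (short exclusive lock for the HWM reset), e.g. in a maintenance window
alter table scott.emp shrink space;
```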

    And now: the tricky part… how to monitor the progress… and just the short answer at the beginning: it’s not possible. If you think of using v$session_longops – as I mentioned above, there are many short transactions…

    but there are some other indications you can use to check progress, although you can’t calculate the end time properly… nevertheless it’s enough to see the process is ongoing 😉

    select b.sid
      ,a.event
      ,a.wait_time
      ,round(d.read_value/1024/1024/1024,2) read_gb
      ,round(d.write_value/1024/1024/1024,2) write_gb
      ,round(d.undo_value/1024/1024/1024,2) undo_gb
      ,e.current_undo
      ,c.sql_text
    from v$session_wait a
    join v$session b on a.sid = b.sid
    join v$sqlarea c on b.SQL_ID = c.SQL_ID
    join (select * from (select s.sid, t.name, s.value from v$sesstat s inner join v$statname t on s.statistic#=t.statistic#)
                  pivot (sum(value) as value for name in (
                     'undo change vector size' UNDO
                    ,'physical read total bytes' READ
                    ,'physical write total bytes' WRITE
    ))) d on a.sid=d.sid
    join (select b.sid, sum(a.USED_UBLK) current_undo from V$TRANSACTION a join v$session b on a.SES_ADDR=b.saddr group by b.sid) e on e.sid=a.sid
    where upper(c.sql_text) like 'ALTER TABLE%SHRINK SPACE%'
    and b.sid != (select sys_context('USERENV','SID') from dual);

    Just to explain the output:

    • SID: the user SID executing the shrink command
    • EVENT: the current wait event… as long as it’s “direct path read” or “db file … read” you are in step 1
    • WAIT_TIME: should be nearly 0, since we have just small transactions.
    • READ_GB: amount of data the session has read from disk
    • WRITE_GB: amount of data the session has written to disk
    • UNDO_GB: amount of data the session has written to undo
    • CURRENT_UNDO: number of undo blocks currently used… should be very low (<20) – see “small transactions”
    • SQL_TEXT: the SQL text we are monitoring…

    ORA-01652: Who is using the temp tablespace?

    if you are getting the error “ORA-01652: unable to extend temp segment by 128 in tablespace TEMP” you could just increase the temp tablespace, e.g. by adding an additional tempfile… the smarter way would be to check who is consuming so much temp space…
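    the quick fix mentioned above would look like this – path, size and limits are placeholders, adjust them to your environment:

```sql
-- add another tempfile to the TEMP tablespace (placeholder path and sizes)
alter tablespace temp add tempfile '<PATH>/temp02.dbf' size 10g autoextend on next 1g maxsize 32g;
```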

    The following SQL is giving you a live view:

    SELECT s.sid
    	,s.serial#
    	,s.username
    	,round(su.blocks * tbs.block_size / 1024 / 1024 / 1024) gb_used
    	--,sa.sql_text     -- uncomment if you want to see the SQL text
    	--,sa.sql_fulltext -- uncomment if your client can handle CLOBs (e.g. SQL Developer)
    FROM v$sort_usage su
    inner join v$session s on su.session_addr=s.saddr
    inner join dba_tablespaces tbs on su.tablespace=tbs.tablespace_name
    inner join v$sqlarea sa on s.sql_id=sa.sql_id
    where su.blocks*tbs.block_size > 1024*1024*1024 -- more than 1 GB TEMP
    ORDER BY gb_used desc;

    and if you have to check what happened some time ago (take care: Diagnostics & Tuning Pack required!)

    SELECT sample_time
      ,session_id
      ,sql_id
      ,round(temp_space_allocated/1024/1024/1024) gb_used
    FROM dba_hist_active_sess_history
    WHERE temp_space_allocated > 1024*1024*1024
    ORDER BY sample_time;